Which type of method is used to collect information during passive reconnaissance?

Information Gathering and getting to know the target systems is the first process in ethical hacking. Reconnaissance is a set of processes and techniques (Footprinting, Scanning & Enumeration) used to covertly discover and collect information about a target system.


During reconnaissance, an ethical hacker attempts to gather as much information about a target system as possible, following the seven steps listed below −

Gather initial information
Determine the network range
Identify active machines
Discover open ports and access points
Fingerprint the operating system
Uncover services on ports
Map the network

We will discuss in detail all these steps in the subsequent chapters of this tutorial. Reconnaissance takes place in two parts − Active Reconnaissance and Passive Reconnaissance.

Active Reconnaissance

In this process, you interact directly with the computer system to gain information. This information can be relevant and accurate, but there is a risk of being detected if you perform active reconnaissance without permission. If you are detected, the system administrator can take severe action against you and trace your subsequent activities.

Passive Reconnaissance

In this process, you will not be directly connected to a computer system. This process is used to gather essential information without ever interacting with the target systems.

Reconnaissance is the first step of penetration testing after formal acceptance by a cybersecurity organization. Now the question arises, what is reconnaissance? It is the step where the attacker tries to gather as much information as possible about the target's environment and network. It is further classified into two types: passive and active reconnaissance. In this article, we will cover passive reconnaissance techniques for penetration testing.

Passive Reconnaissance: It is a penetration testing technique where attackers extract information related to the target without interacting with the target. That means no request has been sent directly to the target. Generally, the public resource is used to gather information. Security Experts first try to get information via passive reconnaissance.

Active Reconnaissance: It is a penetration testing technique where an attacker gets information related to the target by interacting with the target. Here, scanners such as Nessus (a vulnerability scanner) and Nmap or Masscan (port scanners) may be used to extract information.

In this article, we will concentrate on passive reconnaissance tools and techniques. Some of these techniques are listed below:

(1) Search engines: You can use Google, Bing, and other search engines to extract information such as usernames, passwords, hidden web pages, technologies in use, files that contain metadata, etc. You can also use the popular Google Hacking Database, which is available at https://www.exploit-db.com/google-hacking-database.

(2) Certificate Transparency (https://transparencyreport.google.com/https/certificates) - This resource can be used to identify issued certificates of targets. This will help the attacker to widen the scope of penetration testing.
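As a rough illustration of how certificate data widens the scope, the Python sketch below pulls in-scope hostnames out of certificate records. The record layout (a `name_value` field holding newline-separated names, as some Certificate Transparency search services return in JSON) and the helper name `hostnames_from_ct` are assumptions made for this example; the sample data is invented.

```python
def hostnames_from_ct(records, domain):
    """Return the sorted, de-duplicated hostnames under `domain`."""
    names = set()
    for record in records:
        # Assumed layout: one or more names separated by newlines.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # treat a wildcard as its parent name
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return sorted(names)


# Invented sample records, for illustration only.
SAMPLE_RECORDS = [
    {"name_value": "www.example.com\nexample.com"},
    {"name_value": "*.dev.example.com"},
    {"name_value": "unrelated.example.org"},
]

print(hostnames_from_ct(SAMPLE_RECORDS, "example.com"))
```

Each hostname found this way is a candidate for the DNS and whois lookups described later, without any request having been sent to the target.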

(3) Guess Hostname - Use nslookup command followed by whois to get information related to the hostname. Alternatively, you can use https://www.ripe.net/ to gain the same information.

(4) Regional Internet Registries - Search online portals such as AFRINIC, APNIC, ARIN, LACNIC, etc. for subnets and technical contacts.

(5) Netcraft (https://sitereport.netcraft.com/): You can use this website to get information about the web server, network, SSL/TLS, hosting history, Sender Policy Framework, etc.

(6) Platform Identification and CVE searching - Wappalyzer (https://www.wappalyzer.com/), BuiltWith (https://builtwith.com/) etc. can be used to identify technologies (programming languages, frameworks etc.) of a web application.

(7) Shodan (https://www.shodan.io/) - It can be used to identify IoT and network devices connected to the internet. It acts as a single source for listing possible attack surfaces and vulnerabilities. Below are some queries and filters you can use with Shodan:

apache city:"New York"
OLD IIS
"iis/6.0"
city:
hostname:
port:
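The filters above can be combined into a single query string. As a minimal sketch (the helper name `build_shodan_query` is hypothetical, and nothing is sent to Shodan), the following Python composes free-text terms with filters such as `city:` and `port:`:

```python
def build_shodan_query(*terms, **filters):
    """Compose a Shodan search string from free-text terms and filters."""
    parts = list(terms)
    for name, value in filters.items():
        value = str(value)
        # Quote values containing spaces so they stay a single token.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


query = build_shodan_query("apache", city="New York", port=443)
print(query)
```

The resulting string can be pasted into the Shodan search box; building it in code mainly helps when the same filters are reused across many targets.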

(8) Censys searches (https://censys.io/) : It helps in discovering exposures and entry points such as ports, whois data, etc for attackers.

(9) ExifTool by Phil Harvey: It is a command-line application for reading and writing meta information of different types of files.

(10) Find sensitive information: Search paste sites such as https://pastebin.com for sensitive data such as usernames, passwords, social security numbers, credit card numbers, etc.

(11) Archive.org

Sometimes an old version of a website gives you a lot of information. By using https://archive.org, you can see old versions of the website at different points in its history. Remember, it does not help in finding vulnerabilities in the current version of the website.
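Archived copies are addressed by a timestamp and the original URL. The sketch below only builds the relevant URLs (no request is made); the function names are hypothetical, and the URL layouts are those commonly used by the Wayback Machine and its availability endpoint.

```python
from urllib.parse import urlencode


def wayback_snapshot_url(url, timestamp):
    """Return the Wayback Machine URL for `url` near `timestamp` (YYYYMMDD)."""
    return f"https://web.archive.org/web/{timestamp}/{url}"


def wayback_availability_url(url, timestamp=None):
    """Return the availability-lookup URL that reports the closest snapshot."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


print(wayback_snapshot_url("example.com", "20150101"))
print(wayback_availability_url("example.com", "20150101"))
```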

Automation Tools for Passive Reconnaissance

Spiderfoot (https://github.com/smicallef/spiderfoot): This helps in automation of Open Source Intelligence (OSINT). OSINT is a technique of collection of data from publicly available sources to collect IP addresses, domain names, e-mail addresses, names, and more.
theHarvester (https://github.com/laramies/theHarvester): This tool helps in identifying email, sub-domain, and other information of the target.
Discover (https://github.com/leebaird/discover): This tool automates several pen testing tasks and can be used for both active and passive reconnaissance.
Recon-ng (https://github.com/lanmaster53/recon-ng): It is my favorite web reconnaissance framework which helps in analyzing a lot of data.
OWASP Amass - This tool provides both active and passive recon techniques.

Conclusion

Passive reconnaissance is helpful in widening the attack surface and identifying low-hanging vulnerabilities. If I have missed any technique, leave a comment and I will update the article.

Passive reconnaissance and active reconnaissance are the two primary forms of reconnaissance used by an attacker or pentester to assess a target before exploiting it.

An attacker will often devote up to 70% of their total penetration effort to active or passive reconnaissance, gathering as much intelligence as possible to mount a successful social engineering attack or another form of attack.

This article will take a deep dive into an attacker’s passive reconnaissance mindset and processes such as open-source intelligence, DNS reconnaissance, user information, and password profiling, as well as the most commonly used tools.

All the information gathered during this step is critical in preparing and executing a successful attack on a specific target. Experienced hackers never miss this step.

Kali Linux comes preinstalled with a large suite of tools used for active and passive reconnaissance. However, most of these tools are built on Python and can be used on other operating systems as long as they match the prerequisites.

Without further ado, let’s get started.

In general, passive reconnaissance involves evaluating publicly available information.

This information is generally obtained through various web sources or directly from the targeted organization’s employees.

During this process, the pentester or attacker does not interact directly with the target machine, so little or nothing is logged by the target that could be traced back to the attacker.

Initially, passive reconnaissance is carried out in such a way as to avoid direct contact with the target that may indicate an approaching assault or reveal the attacker’s identity.

For instance, an attacker may access the business website of a target company, read multiple pages, download documents for further investigation, etc. These interactions are considered normal behaviors and are seldom identified as a precursor to a targeted attack.

But wait! There’s more than meets the eye.

Passive reconnaissance can include other more relevant sources of intelligence gathering. Here are some of the most commonly used methods:

Open-source intelligence [OSINT].
DNS reconnaissance and route mapping [IPv4 and IPv6].
Collecting user information.
Creating user password profiles.

Let’s find out what each of these methods is and which tools are available for it.

Also known as OSINT, the open-source intelligence gathering is usually the first step in a planned attack or penetration test.

NOTE: OSINT refers to information gathered from publicly available sources, most notably the Internet.

The quantity of open-source information accessible is substantial. Most intelligence and military organizations actively collect OSINT to gather information on their targets while preventing potential leaks about their identity.

The purpose behind an attack or penetration test determines which information is collected.

If a social engineering technique is used, the attacker may supplement this information with details that lend credibility to requests for information.

An assessment of the target’s official online presence (website, blogs, social media pages, and third-party data repositories such as public financial records) is generally the first step in acquiring OSINT. The following are some items of interest:

Employee information such as names, contact details – phone number, address, e-mail, etc.
Suppliers or business partners with potential access to the target’s network.
Geographical locations of a company’s branches that share sensitive information but may lack security measures.
An overview of the parent firm and its subsidiaries, particularly any new firms acquired via mergers or acquisitions. These companies are often not as secure as the parent company.
Insights into business culture and language can aid in social engineering attacks.
Current technologies. For example, suppose the target releases a press release announcing the adoption of new devices or software. In that case, the attacker will go through the vendor’s website looking for bug complaints, known or suspected vulnerabilities, and information that might enable other types of attacks.
Information mistakenly leaked by the company’s employees, found using search engines such as Google or Bing. Typing the search term “company name” + password filetype:xls on Google may bring up an Excel spreadsheet that contains sensitive information. These search terms are referred to as Google dorks or Google hacking.
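The search term in the last item follows a simple pattern: a quoted phrase, a keyword, and one or more operators. A minimal Python sketch of that pattern follows; the helper name `build_dork` and the company name are hypothetical, and the string is only printed, not submitted to any search engine.

```python
def build_dork(company, keyword, filetype=None, site=None):
    """Compose a search string: quoted company name, keyword, operators."""
    parts = [f'"{company}"', keyword]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


print(build_dork("Example Corp", "password", filetype="xls"))
```

Keeping the operators as optional arguments makes it easy to generate several variations of the same search, for example one per file type.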

NOTE: most search engines have since released APIs to facilitate automated lookups, making tools such as Maltego particularly effective.

Other online sources for passive reconnaissance may include:

Usenet newsgroups, especially messages from target workers seeking assistance with certain technologies.
LinkedIn and similar job websites that give information on job openings, particularly those for technical positions, with job descriptions that list the technologies and services a good candidate must know.
Historic or cached material obtained by search engines (cache:URL in Google, or Internet archive such as Wayback Machine).

Keeping track of all findings may be challenging. Fortunately, tools such as KeepNote allow for the quick import and management of many sorts of data. KeepNote is available for Windows, macOS, and Linux, and it comes preinstalled on Kali Linux.

The second step in passive reconnaissance is to determine the target’s IP [IPv4 or IPv6] addresses and routes.

DNS reconnaissance is concerned with determining who owns a particular domain or set of IP addresses (whois-type information), the DNS information describing the actual domain names and IP addresses allocated to the target, and the path between the penetration tester or attacker and the end target.

Some of the DNS information comes from open sources, while some come from third parties like DNS registrars. Although the registrar may gather IP addresses and data about the attacker’s requests, it is seldom shared with the ultimate target.

Information that the target might directly monitor, such as DNS server logs, is seldom reviewed or retained.

NOTE: It’s important to remember that DNS information might include old or inaccurate entries. Cross-validate data using multiple tools and multiple source servers to reduce false information. Review the results and double-check any non-relevant findings.

Here are some commonly used methods and tools used for DNS reconnaissance and route mapping:

1. WHOIS

Identifying the addresses allocated to the target site is the first step in exploring the IP address space.

The whois command, which enables individuals to query databases that maintain information on the registered users of an Internet resource, such as a domain name or IP address, is generally used to achieve this.

A whois query may return names, physical addresses, phone numbers, and e-mail addresses (which may be valuable in social engineering attacks), as well as IP addresses and DNS server names, depending on the database searched.

NOTE: third parties [Akamai, AWS, Cloudflare, etc.] are increasingly being used to protect this data, and whois information for domains such as .mil and .gov, may not be publicly available.

The majority of requests to these sites are recorded. Various lists available online detail the domains and IP addresses assigned for government use; most tools accept “no contact” addresses, and government domains should be entered there to avoid unwanted attention.

The most convenient way to perform a whois inquiry on a target is via command prompt using the whois <target IP or domain name> command, as seen in Figure 1.1.

Figure 1.1: Passive reconnaissance – whois nudesystems.com

The whois command output can include information that can be used for social engineering attack purposes such as registrar, email, phone numbers, etc.
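Whois output is mostly made up of "Key: value" lines, so it is easy to post-process. The sketch below parses such text into a dictionary; the sample text is invented for illustration (it is not real output for any domain), and real field names vary between registries.

```python
# Invented sample in the common "Key: value" layout.
SAMPLE_WHOIS = """\
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Registrar Abuse Contact Email: abuse@example.net
Name Server: NS1.EXAMPLE.NET
Name Server: NS2.EXAMPLE.NET
"""


def parse_whois(text):
    """Collect 'Key: value' lines into a dict of lists (keys may repeat)."""
    records = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines without a colon or without a value after it.
        if sep and value.strip():
            records.setdefault(key.strip(), []).append(value.strip())
    return records


info = parse_whois(SAMPLE_WHOIS)
print(info["Registrar"])
print(info["Name Server"])
```

Storing every value in a list keeps repeated fields such as name servers intact instead of overwriting them.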

DNS [Domain Name System] is a distributed database that maps names to IP addresses, e.g., nudesystems.com to 104.21.65.47.

Attackers mainly use the DNS information gathered to:

Launch brute-force attacks to discover new domain names linked to the target.
Find service records (SRV) that include information about the service, protocol, port, and priority of services.
Locate potentially vulnerable services (RDP, FTP, etc.).
Identify servers that are misconfigured or not updated with the latest patches.
Review the DomainKeys Identified Mail (DKIM) and Sender Policy Framework (SPF) records that are used to control spam e-mail.
If the DNS server is set up to allow any requester to transfer a zone, it will return hostnames and IP addresses for the machines connected to the Internet, making it simpler to spot possible targets. For example, a zone transfer might reveal the hostnames and IP addresses of internal devices if the destination does not separate public (external) DNS information from private (internal) DNS information.
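The brute-force approach in the first item amounts to trying candidate names from a wordlist and keeping those that resolve. The sketch below shows only that loop; the resolver is passed in as a function, and here it is a stand-in backed by a fixed, invented table, so no DNS query is sent. All names are hypothetical.

```python
def brute_force_subdomains(domain, words, resolve):
    """Return {hostname: address} for every candidate that `resolve` finds."""
    found = {}
    for word in words:
        host = f"{word}.{domain}"
        address = resolve(host)  # expected to return None when unresolved
        if address is not None:
            found[host] = address
    return found


# Stand-in for a DNS lookup: a fixed table of invented records.
FAKE_ZONE = {
    "www.example.com": "192.0.2.10",
    "mail.example.com": "192.0.2.25",
}

hits = brute_force_subdomains("example.com", ["www", "mail", "vpn"], FAKE_ZONE.get)
print(hits)
```

Tools such as dnsrecon and fierce perform the same kind of search at scale, with larger wordlists and real lookups.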

Essential command tools for DNS lookup such as nslookup are available on Windows and Linux/UNIX operating systems. On Linux/UNIX systems, there is an alternative command-line tool called dig [Figure 1.2].

Figure 1.2: Passive reconnaissance – dig nudesystems.com

Unfortunately, both nslookup and dig commands can only query one machine at a time. Kali Linux offers various tools for iteratively querying DNS information for a specific target for IPv4 and IPv6 addresses.

The IP address, or Internet Protocol address, is a numerical identifier for devices connected to a private network or the public Internet. The Internet nowadays is mostly built on IPv4. As seen in the table below, Kali contains numerous command-line tools to aid DNS reconnaissance.

Command	Description
dnsrecon, dnsmap, dnsenum	Used for DNS record enumeration (A, MX, wildcard, TXT, etc.), Google lookup, subdomain brute-force attacks, reverse lookup, zone transfer, and zone walking. dnsrecon is the recommended option because it produces well-structured results that can be imported easily into the Metasploit Framework.
dnswalk	Used to assess the DNS information for internal consistency and accuracy of data.
dnstracer	Used to find where a particular DNS server obtains its information by tracing the chain of DNS servers back to the servers that hold the authoritative data.
fierce	Used to locate non-contiguous IP space and hostnames against given domains by triggering zone transfers followed by brute-force DNS attacks to obtain DNS information.

Table 1.1: Passive reconnaissance – Kali Linux DNS reconnaissance.

TIP: use the fierce command first to ensure that all probable targets have been discovered, followed by dnsrecon and dnsenum to cross-validate the captured DNS data.

The following capture shows dnsrecon running a standard DNS and SRV record search. As you can see, the SRV records for nudesystems.com are not publicly available [Figure 1.3].

Figure 1.3: Passive reconnaissance – IPv4 DNS and SRV record search for nudesystems.com.

While IPv4 seems to provide for a wide address space, freely accessible IP addresses were exhausted some years ago, necessitating the use of NAT and DHCP to boost the number of accessible IPv4 addresses.

The implementation of an enhanced IP addressing method, IPv6, has provided a more lasting solution. Although it accounts for fewer than 5% of Internet addresses, its use is growing, and penetration testers must be prepared to deal with the variations between IPv4 and IPv6 addresses.

NOTE: IPv6 source and destination addresses are 128 bits long, resulting in 2^128 potential addresses, or about 340 undecillion possibilities.
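The size difference between the two address spaces can be checked with Python's standard ipaddress module. This is only a quick arithmetic illustration:

```python
import ipaddress

# Total number of addresses in the whole IPv4 and IPv6 spaces.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
ipv6_total = ipaddress.ip_network("::/0").num_addresses

print(ipv4_total)             # 32-bit space
print(ipv6_total == 2**128)   # 128-bit space
print(f"{ipv6_total:.3e}")    # same value in scientific notation
```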

Kali Linux contains a number of tools designed to take advantage of IPv6 addressing, as seen in Table 1.2 below; general-purpose tools such as Nmap also support IPv6.

Command [Kali Linux]	Description
dnsrevenum6	Performs a reverse DNS enumeration for a given IPv6 address.
dnsdict6	Enumerates the subdomains of a parent domain and gets IPv4 and IPv6 addresses using a brute-force search based on its own internal list or a provided dictionary list.

Table 1.2: Passive reconnaissance – DNS tools for IPv6 reconnaissance.
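Reverse DNS lookups query a special name derived from the address: in-addr.arpa for IPv4 and ip6.arpa for IPv6. As a small sketch, Python's standard ipaddress module can produce that name; the addresses below come from documentation ranges, and the helper name `reverse_name` is hypothetical.

```python
import ipaddress


def reverse_name(address):
    """Return the DNS name queried (PTR record) for a reverse lookup."""
    return ipaddress.ip_address(address).reverse_pointer


print(reverse_name("192.0.2.10"))
print(reverse_name("2001:db8::1"))
```

For IPv6 the name contains one label per hexadecimal digit, which is why reverse enumeration of IPv6 ranges needs dedicated tools rather than a simple sweep.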

Route mapping was developed as a diagnostic tool for seeing the path that an IP packet takes from one host to the next.

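Route mapping relies on the Time to Live (TTL) field in an IP packet: each router that forwards the packet decreases the TTL value by one, and the router at which the value reaches zero discards the packet and returns an ICMP TIME EXCEEDED message. By sending probes with increasing TTL values, the tool gets a reply from each hop in turn.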
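The TTL mechanism can be illustrated without sending any packets. The Python sketch below simulates it over an invented, non-empty path: each router decrements the TTL, the router where it reaches zero answers, and the probing side raises the TTL by one each round until the destination replies. The function names and addresses are hypothetical.

```python
def send_probe(path, ttl):
    """Simulate one probe; return (responder, reached_destination)."""
    for index, hop in enumerate(path):
        if index == len(path) - 1:
            return hop, True      # the destination itself answers
        ttl -= 1                  # each router decrements the TTL
        if ttl == 0:
            return hop, False     # this router reports Time Exceeded


def traceroute(path, max_hops=30):
    """Probe with TTL 1, 2, 3, ... and record who answers each time."""
    route = []
    for ttl in range(1, max_hops + 1):
        responder, done = send_probe(path, ttl)
        route.append((ttl, responder))
        if done:
            break
    return route


# Invented path: two routers, then the destination.
PATH = ["192.0.2.1", "198.51.100.1", "203.0.113.5"]
for ttl, responder in traceroute(PATH):
    print(ttl, responder)
```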

The packets keep track of the number of hops and the path followed. The traceroute data provides the following critical information to an attacker or pentester:

The number of hops between the attacker and the target.
The access control devices (firewalls or routers) that may filter the traffic generated by the attacker.
Details about the external topology of the network.
Internal addressing, if the network is misconfigured.

On Windows, tracert is a command-line utility that uses ICMP packets to map the route between an attacker and a target.

As seen in Figure 1.4, the tracert command on Windows will show the complete path [unfiltered] output.

Figure 1.4: Passive reconnaissance – tracert command on Windows 10.

On Linux/UNIX, the utility is called traceroute [Figure 1.5]. If the traceroute command is run from Kali Linux, most of the hops between source and destination will be filtered [* * *].

Figure 1.5: Passive reconnaissance – traceroute command on Kali Linux.

Therefore, in Kali Linux, there are some additional tools to conduct route traces without filtering the output, as seen in Table 1.3.

Command [Kali Linux]	Description
hping3	A TCP/IP packet assembler and analyzer with a ping-like interface supporting TCP/UDP/ICMP protocols.
trace6	A traceroute utility that uses ICMPv6.
intrace	A tool that allows users to enumerate IP hops by making use of existing TCP connections, initiated either from the local system or network or from remote hosts. This makes it ideal for getting beyond external filters like firewalls.

Table 1.3: Passive reconnaissance – Kali Linux traceroute tools.

Because it gives control over the packet type, source, and destination, hping3 is one of the most valuable tools available, offering many more options than the traditional ping. hping3 comes preinstalled on Kali Linux 2021.x.

Figure 1.6: Passive reconnaissance – hping3.

Many penetration testers collect usernames and email addresses since they are regularly used for logging on to a target device. The browser is the most utilized tool for manually searching an organization’s website and third-party sites such as LinkedIn.

Many companies do not correctly deactivate the accounts of employees who leave the firm, so these credentials may still provide access to the target system.

When executing social engineering attacks, forwarding information requests to a former employee frequently results in a redirect, which provides the attacker with the “credibility” of having interacted with the prior employee.

theHarvester tool is a Python script, which looks for email addresses, hosts, and subdomains via major search engines. theHarvester is quite easy to use as a few command-line switches need to be configured to get it running. Table 1.4 shows some of the most used options available:

Command Switch [theHarvester]	Description
-d	Specify the domain [website] to be searched.
-b	Specify the source for data extraction. Must be one of: Google, Google-Profiles, Bing, BingAPI, LinkedIn, People123, Jigsaw, PGP, or all of the above.
-l	Limit the number of search results to harvest data from.
-f	Save the results to a file [XML, HTML]. Without this switch, the result will be displayed and not saved.

Table 1.4: Passive reconnaissance – The Harvester.
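The switches in Table 1.4 combine into a single command line. As a hedged sketch, the Python below only assembles and prints that command (it does not run it); the helper name `harvester_command`, the domain, and the output file name are placeholders, and the set of accepted sources depends on the installed theHarvester version.

```python
import shlex


def harvester_command(domain, source, limit=None, outfile=None):
    """Build a theHarvester argument list from the switches in Table 1.4."""
    cmd = ["theHarvester", "-d", domain, "-b", source]
    if limit is not None:
        cmd += ["-l", str(limit)]
    if outfile:
        cmd += ["-f", outfile]
    return cmd


cmd = harvester_command("example.com", "google", limit=100, outfile="results")
print(shlex.join(cmd))
```

Building the command as a list rather than a string avoids quoting problems if it is later passed to a process launcher.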

theHarvester requires Python 3.7+ and is available on Linux/UNIX systems. theHarvester comes pre-installed in Kali Linux. Here is an example of theHarvester command against nudesystems.com using Google search.

NOTE: the command theharvester is deprecated and was replaced with theHarvester in the newest releases [Figure 1.7]

Figure 1.7: Passive reconnaissance – The Harvester [Kali Linux].


The end-user cannot see metadata directly. Hence most documents are released with their intact metadata. Unfortunately, this data leakage can divulge valuable information to support an attack.

At a minimum, testers and attackers can collect usernames from document metadata and identify the people associated with certain kinds of documents, e.g., yearly financial reports, technical documentation, etc.

The risks associated with geolocation information have been growing as mobile devices become increasingly ubiquitous. Attackers are looking for less secure places [hotels and restaurants, airports, etc.] to begin an attack on users working outside a company perimeter.

Here are some examples of metadata usually attached to documents:

The author of the document.
The owner of the application used to create the document.
The timestamp of the document creation.
Files created via mobile phone or digital cameras may contain geographical tags.
The timestamp of when the document was modified.
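To make the list above concrete: modern Office files (.docx, .xlsx, .pptx) are ZIP archives that keep these fields in a small XML part named docProps/core.xml. The sketch below reads them with Python's standard library. It builds a tiny stand-in archive in memory instead of downloading a real document, and the user names and helper name are invented.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(fileobj):
    """Read author and timestamp metadata from an Office Open XML file."""
    with zipfile.ZipFile(fileobj) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    # Fields missing from the document come back as None.
    return {name: root.findtext(path, namespaces=NS) for name, path in FIELDS.items()}


# Stand-in document containing only the metadata part (invented values).
CORE_XML = (
    '<cp:coreProperties xmlns:cp="{cp}" xmlns:dc="{dc}" xmlns:dcterms="{dcterms}">'
    "<dc:creator>jdoe</dc:creator>"
    "<cp:lastModifiedBy>asmith</cp:lastModifiedBy>"
    "<dcterms:created>2021-03-01T09:00:00Z</dcterms:created>"
    "</cp:coreProperties>"
).format(**NS)

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("docProps/core.xml", CORE_XML)
buffer.seek(0)

meta = read_core_properties(buffer)
print(meta)
```

Names recovered this way often match the organization's username format, which is why they are useful for the user and password profiling steps.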

Metagoofil is an OSINT Python script that uses Google search to scan a specific website and extracts various information from documents. The supported file extensions are pdf, pptx, doc, docx, xls, xlsx.

Metagoofil will download the specified number of documents into a temporary folder. Subsequently, the information is extracted and organized.

Here is an example of Metagoofil in action in Kali Linux using the Microsoft website to scan and download the available .docx files [Figure 1.8].

Figure 1.8: Passive reconnaissance – Metagoofil [Kali Linux].

NOTE: Metagoofil can be installed in Kali using the command: sudo apt install metagoofil

In the final step in the passive reconnaissance process, an attacker or pentester will attempt to create user-specific passwords based on the intelligence collected so far.

If done manually, this process may take a long time, and the result is not always guaranteed. Furthermore, an application requires you to try every password in the list one at a time to assess if it is working or not.

An alternative option is to use the Common User Password Profiler (CUPP) to generate a customized wordlist. CUPP is a Python 3 script that can be downloaded and installed from the official CUPP GitHub repository.

NOTE: On Kali Linux, CUPP can be installed using the command: sudo apt install cupp

Figure 1.9 below shows a list of 6624 generated passwords using information such as the target name, spouse name, pet name, dates of birth, specific keywords, etc.

Figure 1.9: Passive reconnaissance – CUPP [Kali Linux].

The list of generated passwords can be found in the CUPP directory on your computer.
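The general idea behind such a profile-based wordlist can be sketched in a few lines of Python. This is not CUPP's algorithm, only a simplified illustration that combines personal keywords with common suffixes; the words, suffixes, and function name are invented.

```python
from itertools import product


def build_wordlist(words, suffixes):
    """Combine each keyword (lower-case and capitalized) with each suffix."""
    candidates = set()
    for word, suffix in product(words, suffixes):
        for variant in (word.lower(), word.capitalize()):
            candidates.add(variant + suffix)
    return sorted(candidates)


# Invented profile data: a first name and a pet name, plus a few suffixes.
wordlist = build_wordlist(["alice", "rex"], ["", "1985", "!"])
print(len(wordlist))
print(wordlist[:4])
```

Even this small example shows how quickly the list grows: every extra keyword or suffix multiplies the number of candidates.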

In conclusion, the aim of passive reconnaissance is to evaluate data freely accessible to the public.

Due to the non-threatening nature of this method, the attacker’s actions and IP address are practically indistinguishable from those of a regular user.

This knowledge may be crucial for instrumenting social engineering attacks or other forms of sophisticated penetration strategies.

I hope this article sheds some light on what passive reconnaissance is, the process behind it, and the tools that are most commonly used by attackers or penetration testers to get relevant information on a target.


Leonard Cucos

Leonard Cucos is an engineer with over 20 years of IT/Telco experience managing large UNIX/Linux-based server infrastructures, IP and Optics core networks, Information Security [red/blue], Data Science, and FinTech.