Information Gathering and getting to know the target systems is the first process in ethical hacking. Reconnaissance is a set of processes and techniques (Footprinting, Scanning & Enumeration) used to covertly discover and collect information about a target system. During reconnaissance, an ethical hacker attempts to gather as much information about a target system as possible, following the seven steps listed below −
We will discuss all of these steps in detail in the subsequent chapters of this tutorial. Reconnaissance takes place in two parts − Active Reconnaissance and Passive Reconnaissance.

Active Reconnaissance
In this process, you interact directly with the computer system to gain information. This information can be relevant and accurate. But there is a risk of getting detected if you are planning active reconnaissance without permission. If you are detected, the system admin can take severe action against you and track your subsequent activities.

Passive Reconnaissance
In this process, you are not directly connected to a computer system. This process is used to gather essential information without ever interacting with the target systems.
Reconnaissance is the first step of penetration testing after formal acceptance by a cybersecurity organization. So what is reconnaissance? It is the step in which the attacker gathers as much information as possible about the target's environment and network. It is further classified into two types: passive and active reconnaissance. In this article, we will cover passive reconnaissance techniques for penetration testing.

Passive Reconnaissance: It is a penetration testing technique where attackers extract information related to the target without interacting with the target. That means no request is sent directly to the target. Generally, public resources are used to gather information. Security experts first try to get information via passive reconnaissance.

Active Reconnaissance: It is a penetration testing technique where an attacker gets information related to the target by interacting with the target. Here, scanners such as Nessus, Nmap, and Masscan may be used to extract information. Refer to the article on Active Reconnaissance Tools for Penetration Testing to know more.

In this article, we will concentrate on passive reconnaissance tools and techniques. Some techniques are listed below:

(1) Search engines: You can use Google, Bing, and other search engines to extract information such as usernames, passwords, hidden web pages, technologies in use, files containing metadata, etc. You can also use the popular Google Hacking Database, which is available at https://www.exploit-db.com/google-hacking-database.
(2) Certificate Transparency (https://transparencyreport.google.com/https/certificates) - This resource can be used to identify certificates issued for the target. This helps the attacker widen the scope of penetration testing.
(3) Guess Hostname - Use the nslookup command followed by whois to get information related to the hostname.
Alternatively, you can use https://www.ripe.net/ to gain the same information.
(4) Regional Internet Registries - Search online portals such as AFRINIC, APNIC, ARIN, LACNIC, etc. for subnets and technical contacts.
(5) Netcraft (https://sitereport.netcraft.com/): You can use this website to get information related to the web server, network, SSL/TLS, hosting history, Sender Policy Framework, etc.
(6) Platform Identification and CVE searching - Wappalyzer (https://www.wappalyzer.com/), BuiltWith (https://builtwith.com/), etc. can be used to identify the technologies (programming languages, frameworks, etc.) behind a web application.
(7) Shodan (https://www.shodan.io/) - It can be used to identify IoT devices and network devices connected to the internet. It acts as a single source for a list of possible attack surfaces and vulnerabilities. Below is the list of queries you can use while using Shodan:
(8) Censys searches (https://censys.io/): It helps attackers discover exposures and entry points such as ports, whois data, etc.
(9) ExifTool by Phil Harvey: It is a command-line application for reading and writing the meta information of different types of files.
(10) Find sensitive info - Search sites like https://pastebin.com to find sensitive data such as usernames, passwords, social security numbers, credit card numbers, etc.
(11) Archive.org - Sometimes an old version of a website gives you a lot of information. By using https://archive.org, you can see old versions of the website at different points in its history. Remember, it does not reveal vulnerabilities in the current version of the website.

Automation Tools for Passive Reconnaissance
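The certificate transparency technique in item (2) is one that lends itself to automation. The following is a minimal sketch rather than a finished tool. It assumes certificate records have already been exported as JSON in the shape used by the public crt.sh service (a list of objects with a `name_value` field holding newline-separated names); that service, the function name `hostnames_from_ct`, and the sample data are assumptions for illustration and do not come from the list above.

```python
import json


def hostnames_from_ct(records_json: str, domain: str) -> list[str]:
    """Return sorted, de-duplicated hostnames under `domain` found in CT records.

    Assumes a JSON list of objects with a `name_value` field that holds one
    or more newline-separated names (the crt.sh export shape; an assumption).
    """
    domain = domain.lower().rstrip(".")
    found = set()
    for record in json.loads(records_json):
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower().rstrip(".")
            # A wildcard entry such as *.dev.example.com only reveals the parent name.
            if name.startswith("*."):
                name = name[2:]
            # Keep the domain itself and anything beneath it, nothing else.
            if name == domain or name.endswith("." + domain):
                found.add(name)
    return sorted(found)


# Hypothetical sample data standing in for a real export.
SAMPLE = json.dumps([
    {"name_value": "www.example.com\nmail.example.com"},
    {"name_value": "*.dev.example.com"},
    {"name_value": "WWW.EXAMPLE.COM"},
    {"name_value": "unrelated.example.org"},
])

if __name__ == "__main__":
    for host in hostnames_from_ct(SAMPLE, "example.com"):
        print(host)
```

Because the function only parses data that was collected elsewhere, it sends nothing to the target, which keeps it within the passive techniques described in this article.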
Conclusion
Passive reconnaissance is helpful in widening the attack surface and identifying low-hanging vulnerabilities. If I have missed any technique, I will update the article accordingly.

Passive reconnaissance and active reconnaissance are the two primary forms of reconnaissance used by an attacker or pentester to assess a target before exploiting it. An attacker will often devote up to 70% of their total penetration effort to active or passive reconnaissance, gathering as much intelligence as possible to mount a successful social engineering attack or other forms of attack. This article will take a deep dive into an attacker’s passive reconnaissance mindset and processes, such as open-source intelligence, DNS reconnaissance, user information, and password profiling, as well as the most commonly used tools. All the information gathered during this step is critical in preparing and executing a successful attack on a specific target. Experienced hackers never miss this step.

Kali Linux comes preinstalled with a large suite of tools used for active and passive reconnaissance. However, most of these tools are built on Python and can be used on other operating systems as long as the prerequisites are met. Without further ado, let’s get started.

In general, passive reconnaissance involves evaluating publicly available information. This information is generally obtained through various web sources or directly from the targeted organization’s employees. During this process, the pentester or attacker does not interact directly with the target machine, so the target logs little or nothing that can be traced back to the attacker. Initially, passive reconnaissance is carried out in a way that avoids any direct contact with the target that might signal an approaching attack or reveal the attacker’s identity. For instance, an attacker may access the business website of a target company, read multiple pages, download documents for further investigation, etc.
These interactions are considered normal behaviors and are seldom identified as a precursor to a targeted attack. But wait! There’s more than meets the eye. Passive reconnaissance can include other more relevant sources of intelligence gathering. Here are some of the most commonly used methods:
Let’s find out what each of the above methods is and which tools are available for it. Open-source intelligence gathering, also known as OSINT, is usually the first step in a planned attack or penetration test.
The quantity of open-source information accessible is substantial. Most intelligence and military organizations actively collect OSINT to gather information on their targets while preventing potential leaks about their identity. What is collected is determined primarily by the purpose of the attack or penetration test. If a social engineering technique is used, the attacker may enhance this information with facts that lend credence to the request for information. Assessing the target’s official online presence (website, blogs, social media pages, and third-party data repositories such as public financial records) is generally the first step in acquiring OSINT. The following are some items of interest:
Other online sources for passive reconnaissance may include:
Keeping track of all findings may be challenging. Fortunately, tools such as KeepNote allow for the quick import and maintenance of many sorts of data. KeepNote is available for Windows, macOS, and Linux, and it comes preinstalled on Kali Linux.

The second step in passive reconnaissance is to determine the target’s IP [IPv4 or IPv6] addresses and routes. DNS reconnaissance is concerned with determining who owns a particular domain or set of IP addresses (whois-type information), the DNS information describing the actual domain names and IP addresses allocated to the target, and the path between the penetration tester or attacker and the end target. Some of the DNS information comes from open sources, while some comes from third parties like DNS registrars. Although the registrar may gather IP addresses and data about the attacker’s requests, this is seldom shared with the ultimate target. Information that the target might directly monitor, such as DNS server logs, is rarely inspected or retained.
Here are some commonly used methods and tools for DNS reconnaissance and route mapping:

1. WHOIS
Identifying the addresses allocated to the target site is the first step in exploring the IP address space. This is generally done with the whois command, which enables individuals to query databases that maintain information on the registered users of an Internet resource, such as a domain name or IP address. Depending on the database searched, a whois query may return names, physical addresses, phone numbers, and e-mail addresses (which may be valuable in social engineering attacks), as well as IP addresses and DNS server names.
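To make the fields described above easier to work with, whois output that has already been captured as text can be split into key/value pairs. This is only a sketch under assumptions: field names and layout differ between registries, and the helper name `parse_whois` and the sample record below are hypothetical.

```python
from collections import defaultdict


def parse_whois(text: str) -> dict[str, list[str]]:
    """Collect `Key: value` pairs from whois output, keeping repeated keys."""
    fields = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the comment lines some registries emit (% or #).
        if not line or line[0] in "%#":
            continue
        # Split on the first colon only, so values such as URLs stay intact.
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value:
            fields[key.strip().lower()].append(value)
    return dict(fields)


# Hypothetical record for illustration only.
SAMPLE = """\
% Hypothetical output for illustration only
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Name Server: NS1.EXAMPLE.NET
Name Server: NS2.EXAMPLE.NET
Registrant Email: admin@example.com
"""

if __name__ == "__main__":
    record = parse_whois(SAMPLE)
    print(record["name server"])
    print(record.get("registrant email", []))
```

Keeping every value for a repeated key matters here, because records usually list several name servers and sometimes several contacts.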
The majority of requests to these sites are recorded. There are various online lists that detail the domains and IP addresses assigned to governments; most tools accept a list of "no contact" addresses, and government domains should be entered into that list to avoid unwanted attention. The most convenient way to perform a whois inquiry on a target is from the command prompt, using the whois <target IP or domain name> command, as seen in Figure 1.1.

Figure 1.1: Passive reconnaissance – whois nudesystems.com

The whois command output can include information that is useful for social engineering attacks, such as the registrar, email addresses, phone numbers, etc.

DNS [Domain Name System] is a distributed database that maps names to IP addresses, e.g., nudesystems.com to 104.21.65.47. Attackers mainly use the DNS information gathered to:
Basic command-line tools for DNS lookup, such as nslookup, are available on Windows and Linux/UNIX operating systems. On Linux/UNIX systems, there is an alternative command-line tool called dig [Figure 1.2].

Figure 1.2: Passive reconnaissance – dig nudesystems.com

Unfortunately, both the nslookup and dig commands can only query one machine at a time. Kali Linux offers various tools for iteratively querying DNS information for a specific target, for both IPv4 and IPv6 addresses. The IP address, or Internet Protocol address, is a numerical identifier for devices linked to a private or public network. The Internet nowadays is mostly built on IPv4. As seen in the table below, Kali contains numerous command-line tools to aid DNS reconnaissance.
The following capture shows dnsrecon running a standard DNS and SRV record search. As you can see, the SRV records for nudesystems.com are not publicly available [Figure 1.3].

While IPv4 seems to provide a wide address space, freely available IPv4 addresses were exhausted some years ago, necessitating the use of NAT and DHCP to stretch the number of usable IPv4 addresses. The introduction of an enhanced IP addressing scheme, IPv6, has provided a more lasting solution. Although IPv6 still carries a minority of Internet traffic, its use is growing, and penetration testers must be prepared to deal with the differences between IPv4 and IPv6 addresses.
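One practical consequence of having two address families is that results collected during DNS reconnaissance need to be sorted by family before they are passed to other tools. A minimal sketch using Python's standard `ipaddress` module is shown below; the helper name `group_by_version` is hypothetical, and the IPv6 address comes from the 2001:db8::/32 range reserved for documentation.

```python
import ipaddress


def group_by_version(addresses):
    """Split address strings into IPv4 and IPv6 lists, skipping invalid input."""
    groups = {4: [], 6: []}
    for raw in addresses:
        try:
            ip = ipaddress.ip_address(raw.strip())
        except ValueError:
            continue  # not a valid IP literal, e.g. a hostname
        # str(ip) gives the canonical form, so IPv6 addresses come out compressed.
        groups[ip.version].append(str(ip))
    return groups


if __name__ == "__main__":
    sample = [
        "104.21.65.47",
        "2001:0db8:0000:0000:0000:0000:0000:0001",
        "not-an-ip",
    ]
    print(group_by_version(sample))
```

Normalising each address through `ipaddress.ip_address` also removes duplicates that differ only in notation, which is common with IPv6.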
Kali Linux contains a number of tools designed to take advantage of IPv6 addressing, such as Nmap, which supports IPv6, as seen in Table 1.2 below.
Route mapping was developed as a diagnostic tool for seeing the path that an IP packet takes from one host to the next. It relies on the Time to Live (TTL) field in an IP packet: each router along the path decreases the TTL by one, and a router that reduces it to zero discards the packet and returns an ICMP TIME EXCEEDED message to the sender. By sending probes with successively larger TTL values, the tool learns the number of hops and the path followed. The traceroute data provides the following critical information to an attacker or pentester:
On Windows, tracert is a command-line utility that uses ICMP packets to map the route between an attacker and a target. As seen in Figure 1.4, the tracert command on Windows will show the complete [unfiltered] path.

Figure 1.4: Passive reconnaissance – tracert command on Windows 10.

On Linux/UNIX, the utility is called traceroute [Figure 1.5]. If the traceroute command is run from Kali Linux, most of the hops between source and destination will be filtered [* * *].

Figure 1.5: Passive reconnaissance – traceroute command on Kali Linux.

Therefore, Kali Linux includes some additional tools for conducting route traces without filtered output, as seen in Table 1.3.
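The TTL mechanism behind tracert and traceroute can be illustrated without sending a single packet. The sketch below is a simulation only, under simplifying assumptions: every router answers, nothing is filtered, and the path is a hypothetical list built from address ranges reserved for documentation. The function names `forward` and `trace` are made up for this example.

```python
def forward(path, ttl):
    """Simulate one probe sent with the given TTL along `path`.

    `path` lists the routers in order, ending with the destination host.
    Returns (responder, reply_type).
    """
    for hop in path[:-1]:
        ttl -= 1  # each router decrements the TTL before forwarding
        if ttl == 0:
            # The router drops the probe and reports back to the sender.
            return hop, "time-exceeded"
    return path[-1], "destination-reached"


def trace(path, max_hops=30):
    """Send probes with TTL 1, 2, 3, ... until the destination answers."""
    route = []
    for ttl in range(1, max_hops + 1):
        responder, reply = forward(path, ttl)
        route.append((ttl, responder))
        if reply == "destination-reached":
            break
    return route


if __name__ == "__main__":
    # Hypothetical path built from documentation-only address ranges.
    hops = ["192.0.2.1", "198.51.100.7", "203.0.113.20"]
    for ttl, responder in trace(hops):
        print(ttl, responder)
```

In a real trace, a hop that does not answer would appear as a gap in this list, which is what the [* * *] entries in filtered output represent.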
Because of the control it gives over packet type, source packet, and destination packet, hping3 is one of the most valuable tools around, providing many more options than the traditional ping. hping3 comes preinstalled on Kali Linux 2021.x.

Figure 1.6: Passive reconnaissance – hping3.

Many penetration testers collect usernames and email addresses since they are regularly used for logging on to a target device. The browser is the most utilized tool for manually searching an organization’s website and third-party sites such as LinkedIn. Many companies do not correctly deactivate employee accounts after the employees leave the firm, so these credentials may still provide access to the target system. When executing social engineering attacks, forwarding information requests to a former employee frequently results in a redirect, which provides the attacker with the “credibility” of having interacted with the prior employee.

theHarvester is a Python script that looks for email addresses, hosts, and subdomains via major search engines. It is quite easy to use, as only a few command-line switches need to be set to get it running. Table 1.4 shows some of the most used options available:
theHarvester requires Python 3.7+ and is available on Linux/UNIX systems. theHarvester comes pre-installed in Kali Linux. Here is an example of theHarvester run against nudesystems.com using Google search.

Figure 1.7: Passive reconnaissance – theHarvester [Kali Linux].

A dedicated post covers how to install and use theHarvester in no time.

Metadata is not normally visible to the end user, so most documents are released with their metadata intact. Unfortunately, this data leakage can divulge valuable information to support an attack. At a minimum, testers and attackers can collect usernames from documents and identify the people associated with certain sorts of data, e.g., yearly financial reports, technical documentation, etc. The risks associated with geolocation information have been growing as mobile devices become increasingly ubiquitous. Attackers look for less secure places [hotels, restaurants, airports, etc.] from which to begin an attack on users working outside a company perimeter. Here are some examples of metadata usually attached to documents:
Metagoofil is an OSINT Python script that uses Google search to scan a specific website and extracts various information from documents. The supported file extensions are pdf, pptx, doc, docx, xls, xlsx. Metagoofil will download the specified number of documents into a temporary folder. Subsequently, the information is extracted and organized. Here is an example of Metagoofil in action in Kali Linux using the Microsoft website to scan and download the available .docx files [Figure 1.8]. Figure 1.8: Passive reconnaissance – Metagoofil [Kali Linux].
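To show what document metadata looks like in practice, the sketch below reads the author-related properties from a .docx file using only the Python standard library. It is not how Metagoofil is implemented; it is an independent illustration that relies on the fact that a .docx file is a ZIP package whose core properties are stored in `docProps/core.xml`. The helper names and the sample metadata values are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by docProps/core.xml in Office Open XML packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def core_properties(docx_file):
    """Return the author-related core properties stored in a .docx file."""
    with zipfile.ZipFile(docx_file) as package:
        # Raises KeyError if the package has no core properties part.
        root = ET.fromstring(package.read("docProps/core.xml"))
    found = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            found[label] = element.text
    return found


def make_sample_docx():
    """Build a tiny in-memory package with made-up metadata for the demo."""
    core = (
        '<cp:coreProperties xmlns:cp="%(cp)s" xmlns:dc="%(dc)s" '
        'xmlns:dcterms="%(dcterms)s">'
        "<dc:creator>j.doe</dc:creator>"
        "<cp:lastModifiedBy>a.smith</cp:lastModifiedBy>"
        "</cp:coreProperties>"
    ) % NS
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(core_properties(make_sample_docx()))
```

`core_properties` accepts either a path or a file-like object, so it can be pointed at documents that have already been downloaded to a local folder.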
In the final step of the passive reconnaissance process, an attacker or pentester will attempt to create user-specific passwords based on the intelligence collected so far. If done manually, this process may take a long time, and the result is not guaranteed. Furthermore, every password in the list has to be tried against the application one at a time to check whether it works. An alternative is to use the Common User Password Profiler (CUPP) to generate a customized wordlist. CUPP is a Python 3 script that can be downloaded and installed from the official CUPP GitHub repository.
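The general idea of profile-based wordlists can be sketched in a few lines. The example below is not CUPP's algorithm; it is a much simpler illustration that combines a handful of profile tokens with case variants and a few suffixes. The function name `candidate_words`, the default suffixes, and the sample tokens are all assumptions made for this example.

```python
from itertools import product


def candidate_words(tokens, suffixes=("", "1", "123", "!")):
    """Combine profile tokens with simple case variants and suffixes."""
    variants = set()
    for token in tokens:
        token = token.strip()
        if token:
            variants.update({token.lower(), token.capitalize()})
    # Every variant on its own and with each suffix appended.
    words = {base + suffix for base, suffix in product(variants, suffixes)}
    # Pair distinct tokens as well, e.g. a name followed by a year.
    words.update(
        a + b
        for a, b in product(variants, repeat=2)
        if a.lower() != b.lower()
    )
    return sorted(words)


if __name__ == "__main__":
    # Hypothetical profile: a pet name and a birth year.
    for word in candidate_words(["rex", "1990"]):
        print(word)
```

Returning a sorted, de-duplicated list keeps the output stable between runs, which makes it easier to compare wordlists built from different profiles.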
Figure 1.9 below shows a list of 6624 passwords generated from information such as the target’s name, spouse’s name, pet’s name, dates of birth, specific keywords, etc.

Figure 1.9: Passive reconnaissance – CUPP [Kali Linux].

The list of generated passwords can be found in the CUPP directory on your computer.

In conclusion, the aim of passive reconnaissance is to evaluate data freely accessible to the public. Due to the non-threatening nature of this method, the attacker’s actions and IP address are practically indistinguishable from those of a regular visitor. The knowledge gained may be crucial for mounting social engineering attacks or other sophisticated penetration strategies. I hope this article sheds some light on what passive reconnaissance is, the process behind it, and the tools that are most commonly used by attackers or penetration testers to get relevant information on a target. If you are looking for more tutorials on the various tools used for active and passive reconnaissance, check the Cybersecurity section of this website. See you there!
Leonard Cucos
Leonard Cucos is an engineer with over 20 years of IT/Telco experience managing large UNIX/Linux-based server infrastructures, IP and Optics core networks, Information Security [red/blue], Data Science, and FinTech.