An Introduction to OWASP Amass 4 - Part 6 - The Mysterious case of the Datasource list
The OWASP Amass project is an open-source, actively developed security tool with extensive community support that focuses on information gathering and reconnaissance. It helps security researchers and penetration testers discover and map the attack surface of their target networks by using a variety of data sources. Whether you are a penetration tester, an auditor, a security researcher or the CISO/IT manager, you have several valid reasons for mapping out the external attack surface of an organisation. This process is also referred to as reconnaissance or information gathering.
Version 4 is a major revision of Amass. If you are familiar with earlier versions then you will need to change your approach to understand how it is organized and how this "framework" works.
In this instalment of our series on OWASP Amass version 4, we take a closer look at data sources and some curious behaviours. This is Part 6 of the series. Part 1 is an introduction to the Amass GitHub, Part 2 discusses the data model and the approach to configuration in your workflow, Part 3 explains a Postgres database setup, and Part 4 explains installation of the CLI tool. In Part 5 we introduced configuration and had our first run of the amass enum command.
Overview
The Amass ecosystem can run on its own without external data sources. However, results can be enhanced and improved by the use of external data sources such as Shodan and Censys. The data sources configuration file uses YAML to hold the credentials and API keys for any accounts you may have with these external data sources. But how do you know the configured sources are working?
Logging
The first tool we will look at when we run a command such as amass enum is the amass.log file. In our case this file resides in $HOME/.config/amass. Below is example output from our first amass run.
┌──(user㉿kanga)-[~/.config/amass]
└─$ more amass.log
14:58:49.849349 360PassiveDNS: check callback failed for the configuration
14:58:49.849349 ZoomEye: check callback failed for the configuration
14:58:49.858920 BigDataCloud: check callback failed for the configuration
14:58:49.858985 Ahrefs: check callback failed for the configuration
14:58:49.858995 C99: check callback failed for the configuration
14:58:49.858998 CIRCL: check callback failed for the configuration
What does this mean?
The message “check callback failed for the configuration” informs you that those data sources will not be used. Each supporting module has a check() routine which the Amass ecosystem queries. In these cases the check() function returned failure.
You can determine the external sources it is querying from the same log file:
14:58:49.877719 Querying BeVigil for owasp.org subdomains
14:58:49.877734 Querying Pulsedive for owasp.org subdomains
14:58:49.877481 Querying GitHub for owasp.org subdomains
14:58:49.877830 Querying Google for owasp.org subdomains
14:58:49.877487 Querying ThreatMiner for owasp.org subdomains
14:58:49.877901 Querying DNSHistory for owasp.org subdomains
14:58:49.877752 Querying AbuseIPDB for owasp.org subdomains
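As a minimal sketch, the two kinds of log line shown above can be separated with a few lines of Python. The function name summarise_log and both regular expressions are assumptions based only on the sample lines in this article. They are not an official description of the Amass log format, and other versions may word their messages differently.

```python
import re
from pathlib import Path

# Patterns inferred from the sample log lines in this article (an assumption,
# not an official Amass log specification).
FAILED = re.compile(
    r"^\S+\s+(?P<source>.+?): check callback failed for the configuration\s*$"
)
QUERIED = re.compile(
    r"^\S+\s+Querying (?P<source>.+?) for (?P<domain>\S+) subdomains\s*$"
)


def summarise_log(lines):
    """Return (failed, queried): sets of data source names seen in the log."""
    failed, queried = set(), set()
    for line in lines:
        line = line.strip()
        if m := FAILED.match(line):
            failed.add(m.group("source"))
        elif m := QUERIED.match(line):
            queried.add(m.group("source"))
    return failed, queried


if __name__ == "__main__":
    # Log location used in this article; adjust it for your own setup.
    log_path = Path.home() / ".config" / "amass" / "amass.log"
    if log_path.exists():
        text = log_path.read_text(errors="replace")
        failed, queried = summarise_log(text.splitlines())
        print("Not used (check failed):", ", ".join(sorted(failed)))
        print("Queried:", ", ".join(sorted(queried)))
```

Lines that match neither pattern are ignored, so the sketch can be run against a complete log file without filtering it first.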
Listing Data Sources
Amass has an option that lists the available data sources. However, even if you run it against the default data sources file (with no API keys configured), it will indicate that some data sources are available.
┌──(user㉿kanga)-[~/.config/amass]
└─$ amass intel --list -config ./amass-config-owasp.yaml
Data Source | Type | Available
---------------------------------------------------------------------
360PassiveDNS api
ASNLookup api
AbuseIPDB scrape *
Active Crawl crawl *
-------------------------8<----------------------------
or alternatively:
amass enum --list -config <CONFIG_PATH>
You may wonder why, if no API keys are provided, some data sources are still shown as available. This may be due to:
“Some could be marked incorrectly in the list when the integration script does not have a properly implemented check function” - Jeff Foley, Discord, 01/17/2023 10:32 AM
It may also be due to modules that communicate with services that do not require an account.
The -list option is interesting, but not wholly useful at the moment. The amass.log file is the clearer source of truth for which data sources are available and active.
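One way to act on this is to cross-check the two views. The sketch below parses the listing output into a dictionary and reports any source that the listing marks with * while the log says its check failed. The row layout (name, type, optional *) is inferred from the excerpt in this article, and parse_list_output and suspicious are hypothetical helper names, not part of Amass.

```python
def parse_list_output(text):
    """Parse rows like 'AbuseIPDB scrape *' into {name: (type, available)}.

    Pass only the command output, without the shell prompt lines.
    """
    sources = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, the header row, and separator rules.
        if not line or line.startswith("Data Source") or line.startswith("-"):
            continue
        parts = line.split()
        available = parts[-1] == "*"
        if available:
            parts = parts[:-1]
        if len(parts) < 2:
            continue  # not a recognisable row
        # The last remaining token is the type; the name may contain spaces.
        name, kind = " ".join(parts[:-1]), parts[-1]
        sources[name] = (kind, available)
    return sources


def suspicious(listed, failed_in_log):
    """Sources the list marks as available but the log reports as failed."""
    return sorted(
        name
        for name, (_, available) in listed.items()
        if available and name in failed_in_log
    )
```

Here failed_in_log is simply a set of source names taken from the "check callback failed" lines in amass.log. An empty result means the listing and the log agree for every source marked as available.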
Setting up a Data Source
As mentioned, the data sources file is written in YAML. Each entry has the form:
name: <DataSourceName>
ttl: <a number>
creds:
  account:
    apikey: <api key>
    secret: <account provided secret>
    username: <account username>
    password: <account password>
Here is what these parts mean:
name. The name of your data source.
ttl. The number of minutes for which the data source's response for the target is cached. The default in the original example file is fine here.
creds/account. The account section provides four possible fields for authenticating with the data source: apikey, username, secret, and password. Enter as many as you require for access to your account. Which ones you need is determined by the source. Often it's just an apikey.
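To make the field layout concrete, here is a small sketch that renders one entry as text. It assumes the conventional YAML nesting implied by the field names (account under creds, and the four credential fields under account). The function render_datasource, the source name ExampleSource, and the ttl value are all hypothetical, and the real file may wrap entries in an enclosing structure that is not shown in this article.

```python
# The four credential fields described in this article.
ALLOWED_CREDS = ("apikey", "secret", "username", "password")


def render_datasource(name, ttl, **creds):
    """Render one data source entry as YAML text in the layout described above."""
    unknown = set(creds) - set(ALLOWED_CREDS)
    if unknown:
        raise ValueError(f"unsupported credential fields: {sorted(unknown)}")
    if not creds:
        raise ValueError("at least one credential field is required")
    lines = [f"name: {name}", f"ttl: {ttl}", "creds:", "  account:"]
    for key in ALLOWED_CREDS:  # fixed order keeps the output stable
        if key in creds:
            # Double-quote values so characters such as ':' or '#' stay literal.
            value = str(creds[key]).replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'    {key}: "{value}"')
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values only; never commit real keys to version control.
    print(render_datasource("ExampleSource", 1440, apikey="YOUR_API_KEY"))
```

Rejecting unknown field names up front catches typos such as api_key before they reach the configuration file, where a misspelt field would most likely surface only as a failed check in amass.log.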
Wrap Up
In this instalment we covered a few more details of configuring Amass data sources. We discussed the layout of the data sources file and some ways to see which data sources are in operation, using the log file and the list option.