An Introduction to OWASP Amass 4 - Part 6 - The Mysterious case of the Datasource list

The OWASP Amass project is an open-source, actively developed security tool with extensive community support that focuses on information gathering and reconnaissance. It helps security researchers and penetration testers discover and map the attack surface of their target networks by using a variety of data sources. Whether you are a penetration tester, an auditor, a security researcher or the CISO/IT manager, you have several valid reasons for mapping out the external attack surface of an organisation. This process is also referred to as reconnaissance or information gathering.

Version 4 is a major revision of Amass. If you are familiar with earlier versions then you will need to change your approach to understand how it is organized and how this "framework" works.

In this instalment of our series on OWASP Amass version 4, we take a closer look at data sources and some curious behaviours. This is Part 6 of the series. Part 1 is an introduction to the Amass GitHub, Part 2 discusses the data model and the approach to configuration in your workflow, Part 3 explains a Postgres database setup, and Part 4 explains installation of the CLI tool. In Part 5 we introduced configuration and had our first run of the amass enum command.

Overview

The Amass ecosystem can run on its own without external data sources. However, results can be enhanced and improved by the use of external data sources such as Shodan and Censys. The data sources configuration file uses YAML to hold the credentials and API keys for any accounts you may have with these external data sources. But how do you know the configured sources are working?

Logging

The first tool we will look at when we run a command such as amass enum is the amass.log file. In our case this file will reside in $HOME/.config/amass. Below is an example output from our first amass run.

┌──(user㉿kanga)-[~/.config/amass]
└─$ more amass.log
14:58:49.849349 360PassiveDNS: check callback failed for the configuration
14:58:49.849349 ZoomEye: check callback failed for the configuration
14:58:49.858920 BigDataCloud: check callback failed for the configuration
14:58:49.858985 Ahrefs: check callback failed for the configuration
14:58:49.858995 C99: check callback failed for the configuration
14:58:49.858998 CIRCL: check callback failed for the configuration

What does this mean?


The message “check callback failed for the configuration” informs you that those data sources will not be used. Each supporting module has a check() routine which the Amass ecosystem queries. In these cases the check() function returned failure.
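If the log is long, a small script can pull out the skipped sources for you. The following is a minimal sketch, assuming every failure line has the same shape as the excerpt above (a timestamp, the source name, then the fixed message). The names FAILED_CHECK and failed_sources are hypothetical helpers for illustration and are not part of Amass.

```python
import re

# Assumed line shape, taken from the log excerpt in this article:
# 14:58:49.849349 ZoomEye: check callback failed for the configuration
FAILED_CHECK = re.compile(
    r"^\S+\s+(?P<source>.+?): check callback failed for the configuration\s*$"
)


def failed_sources(log_lines):
    """Return the data sources whose check failed, in first-seen order."""
    names = []
    for line in log_lines:
        match = FAILED_CHECK.match(line.strip())
        if match and match.group("source") not in names:
            names.append(match.group("source"))
    return names


sample = """\
14:58:49.849349 360PassiveDNS: check callback failed for the configuration
14:58:49.849349 ZoomEye: check callback failed for the configuration
14:58:49.877719 Querying BeVigil for owasp.org subdomains
"""

print(failed_sources(sample.splitlines()))
```

To run it against a real log, read the file's lines (for example from $HOME/.config/amass/amass.log) and pass them to failed_sources in place of the sample.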

You can determine the external sources it is querying from the same log file:

14:58:49.877719 Querying BeVigil for owasp.org subdomains
14:58:49.877734 Querying Pulsedive for owasp.org subdomains
14:58:49.877481 Querying GitHub for owasp.org subdomains
14:58:49.877830 Querying Google for owasp.org subdomains
14:58:49.877487 Querying ThreatMiner for owasp.org subdomains
14:58:49.877901 Querying DNSHistory for owasp.org subdomains
14:58:49.877752 Querying AbuseIPDB for owasp.org subdomains
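The same approach works for the sources that were actually queried. This is a rough sketch, assuming the "Querying <source> for <domain> subdomains" wording shown above stays stable. QUERYING and queried_sources are hypothetical names chosen for this example.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the log excerpt in this article:
# 14:58:49.877719 Querying BeVigil for owasp.org subdomains
QUERYING = re.compile(
    r"^\S+\s+Querying (?P<source>.+?) for (?P<domain>\S+) subdomains\s*$"
)


def queried_sources(log_lines):
    """Map each target domain to the sorted list of sources queried for it."""
    by_domain = defaultdict(set)
    for line in log_lines:
        match = QUERYING.match(line.strip())
        if match:
            by_domain[match.group("domain")].add(match.group("source"))
    return {domain: sorted(sources) for domain, sources in by_domain.items()}


sample = """\
14:58:49.877719 Querying BeVigil for owasp.org subdomains
14:58:49.877734 Querying Pulsedive for owasp.org subdomains
14:58:49.877481 Querying GitHub for owasp.org subdomains
14:58:49.858998 CIRCL: check callback failed for the configuration
"""

print(queried_sources(sample.splitlines()))
```

Grouping by domain keeps the result readable when a single run covers more than one target.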

Listing Data Sources

Amass has a parameter that tells you which data sources are available. However, if you run it against the default data sources file (with no API keys configured), it will still indicate that some data sources are available.

┌──(user㉿kanga)-[~/.config/amass]
└─$ amass intel --list -config ./amass-config-owasp.yaml
Data Source               | Type                    | Available
---------------------------------------------------------------------
360PassiveDNS               api                         
ASNLookup                   api                         
AbuseIPDB                   scrape                      *
Active Crawl                crawl                       *
-------------------------8<----------------------------

or alternatively:

amass enum --list -config <CONFIG_PATH>
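If you want to work with the listing programmatically, the table can be parsed into rows. The sketch below assumes the layout shown in the excerpt above: columns separated by runs of two or more spaces, and an asterisk in the last column for sources flagged as available. parse_source_list is a hypothetical helper, and the parsing rules are guesses based on that excerpt rather than a documented output format.

```python
import re


def parse_source_list(output):
    """Parse the text of the --list table into (name, type, available) tuples."""
    rows = []
    for line in output.splitlines():
        text = line.strip()
        # Skip blank lines, the header row (contains "|") and dashed rules.
        if not text or "|" in text or text.startswith("-"):
            continue
        # Names such as "Active Crawl" contain a single space, so split
        # only on runs of two or more whitespace characters.
        fields = re.split(r"\s{2,}", text)
        if len(fields) < 2:
            continue
        available = len(fields) > 2 and fields[2] == "*"
        rows.append((fields[0], fields[1], available))
    return rows


sample = """\
Data Source               | Type                    | Available
---------------------------------------------------------------------
360PassiveDNS               api
AbuseIPDB                   scrape                      *
Active Crawl                crawl                       *
"""

for name, kind, available in parse_source_list(sample):
    print(f"{name:<16} {kind:<8} {'available' if available else 'not available'}")
```

Bear in mind that this only reports what the listing claims; the log file remains the better indicator of what was actually used.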

You may wonder why, if no API keys are provided, some data sources are still shown as available. This may be due to:

“Some could be marked incorrectly in the list when the integration script does not have a properly implemented check function” - Jeff Foley, Discord, 01/17/2023 10:32 AM

It may also be due to modules that communicate with services that do not require an account.

The --list option is interesting, but not wholly useful at the moment. The amass.log file is the clear source of truth for which data sources are available and active.

Setting up a Data Source

As mentioned, the data sources file is written in YAML. Each entry takes the form:

name: <DataSourceName>
ttl: <a number>
creds:
  account:
    apikey: <api key>
    secret: <account provided secret>
    username: <account username>
    password: <account password>

Here is what these parts mean:

  • name. The name of your data source.

  • ttl. The number of minutes that the response of the data source for the target is cached. The default in the original example file is fine here.

  • creds/account. The account section offers four possible fields for authenticating to the data source: apikey, username, secret, and password. Enter as many as your account requires; which ones are needed is determined by the source. Often it's just an apikey.
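To make the template above concrete, here is a small sketch that renders one entry in that form and includes only the credential fields you supply. render_datasource is a hypothetical helper, and the source name, ttl value and key shown are placeholders for illustration, not recommended settings. It does no YAML quoting, so values containing special characters would need to be quoted by hand.

```python
# Credential fields named in the template above.
ALLOWED_FIELDS = ("apikey", "secret", "username", "password")


def render_datasource(name, ttl, **creds):
    """Render one data source entry as text in the form shown in this article."""
    unknown = sorted(set(creds) - set(ALLOWED_FIELDS))
    if unknown:
        raise ValueError(f"unsupported credential fields: {unknown}")
    lines = [f"name: {name}", f"ttl: {ttl}"]
    supplied = [(key, creds[key]) for key in ALLOWED_FIELDS if key in creds]
    if supplied:
        lines += ["creds:", "  account:"]
        lines += [f"    {key}: {value}" for key, value in supplied]
    return "\n".join(lines)


# Placeholder values only: substitute your own source name, ttl and key.
print(render_datasource("Shodan", 3600, apikey="<api key>"))
```

Rejecting unknown field names early catches typos such as api_key before they end up in the configuration file.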

Wrap Up

In this instalment we covered a few more details of configuring Amass data sources. We discussed the format of the data sources file and some ways to view data sources in operation using the log file and the list option.








