Passive DNS Capture/Monitoring Framework

DNS Monster

Passive DNS monitoring framework built in Go. dnsmonster implements a packet sniffer for DNS traffic. It can accept traffic from a pcap file, a live interface or a dnstap socket, and can index and store hundreds of thousands of DNS queries per second; it has been shown to index 200k+ DNS queries per second on a commodity computer. It aims to be scalable, simple and easy to use, helping security teams understand the details of an enterprise's DNS traffic. dnsmonster does not try to follow DNS conversations; rather, it indexes DNS packets as soon as they arrive. It also aims to preserve end-user privacy: it can mask Layer 3 IPs (IPv4 and IPv6), enabling teams to perform trend analysis on aggregated data without being able to trace queries back to an individual. Blogpost

The code before version 1.x is considered beta quality and is subject to breaking changes. Please check the release notes for each tag to see the list of breaking scenarios between each release, and how to mitigate potential data loss.

Inner logic of dnsmonster

Main features

  • Can use Linux's afpacket and zero-copy packet capture.
  • Supports BPF
  • Can mask IP to enhance privacy
  • Can have a pre-processing sampling ratio
  • Can have a list of "skip" FQDNs to avoid writing certain domains (FQDN, prefix or suffix match) to storage, improving DB performance
  • Can have a list of "allow" domains to only log hits on certain domains in ClickHouse/stdout/file
  • Modular output with different logic per output stream. See Supported Outputs
  • Hot-reload of skip and allow domain files
  • Automatic data retention policy using ClickHouse's TTL attribute
  • Simple Grafana dashboard for Clickhouse tables
  • Can be shipped as a single, statically-linked binary
  • Ability to be configured using Env variables, command line options or configuration file
  • Ability to sample output metrics using ClickHouse's SAMPLE capability
  • High compression ratio thanks to ClickHouse's built-in LZ4 storage
  • Supports DNS Over TCP, Fragmented DNS (udp/tcp) and IPv6
  • Supports dnstap over a Unix socket or TCP (see the sketch below)
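
For instance, the IP masking and dnstap features map directly to flags. A minimal sketch (the interface name and socket path are assumptions):

# mask client IPs before output: keep /24 of IPv4 and /48 of IPv6
sudo dnsmonster --devName=eth0 --maskSize4=24 --maskSize6=48 --stdoutOutputType=1
# or consume dnstap from a unix socket instead of sniffing an interface
sudo dnsmonster --dnstapSocket=unix:///tmp/dnstap.sock --dnstapPermission=755 --stdoutOutputType=1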

Installation

Linux

The best way to get started with dnsmonster is to download the binary from the release section. The binary is statically built against musl, so it should work out of the box on most distros. For afpacket support, you need kernel 3.x+. Any modern Linux distribution (CentOS/RHEL 7+, Ubuntu 14.04.2+, Debian 7+) ships with a 3.x+ kernel, so it should work out of the box. If the pre-compiled binary does not work properly on your distro, please submit an issue with the details, and build dnsmonster manually (see Build Manually).
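
A quick start might look like the following; the release asset name here is hypothetical, so check the releases page for the exact filename of the latest build:

# download a pre-built static binary, mark it executable and sniff the loopback interface
wget https://github.com/mosajjal/dnsmonster/releases/latest/download/dnsmonster-linux-amd64.bin -O dnsmonster
chmod +x dnsmonster
sudo ./dnsmonster --devName=lo --stdoutOutputType=1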

Container

Since dnsmonster uses raw packet capture functionality, Docker/Podman must grant the NET_RAW and NET_ADMIN capabilities to the container:

sudo docker run --rm -it --net=host --cap-add NET_RAW --cap-add NET_ADMIN --name dnsmonster ghcr.io/mosajjal/dnsmonster:latest --devName lo --stdoutOutputType=1

Build Manually

  • with libpcap: Make sure you have go, libpcap-devel and linux-headers packages installed. The name of the packages might differ based on your distribution. After this, simply clone the repository and run go build .
git clone https://github.com/mosajjal/dnsmonster --depth 1 /tmp/dnsmonster 
cd /tmp/dnsmonster
go get
go build -o dnsmonster .
  • without libpcap: dnsmonster only uses one function from libpcap: converting tcpdump-style filters into BPF bytecode. If you can live without BPF support, you can build dnsmonster without libpcap. Note that on all other platforms (*BSD, Windows, Darwin), packet capture falls back to libpcap, so there it remains a hard dependency
git clone https://github.com/mosajjal/dnsmonster --depth 1 /tmp/dnsmonster 
cd /tmp/dnsmonster
go get
go build -o dnsmonster -tags nolibpcap .

The above build also works on ARMv7 (RPi4) and AArch64.

Build Statically

If you have a copy of libpcap.a, you can statically link it into dnsmonster and build a fully static binary. In the commands below, change /root/libpcap-1.9.1/libpcap.a to the location of your copy.

git clone https://github.com/mosajjal/dnsmonster --depth 1 /tmp/dnsmonster
cd /tmp/dnsmonster/
go get
go build --ldflags "-L /root/libpcap-1.9.1/libpcap.a -linkmode external -extldflags \"-I/usr/include/libnl3 -lnl-genl-3 -lnl-3 -static\"" -a -o dnsmonster

For more information on how the statically linked binary is created, take a look at this Dockerfile.

Windows

Building on Windows is much the same as on Linux. Just make sure that you have npcap installed. Clone the repository (--depth 1 works), then run go get and go build .

As mentioned, the Windows release of the binary depends on npcap being installed. After installation, the binary should work out of the box. I've tested it in a Windows 10 environment and it ran without an issue. To find the interface names for the --devName parameter and start sniffing, you'll need to do the following:

  • open cmd.exe as Administrator and run getmac.exe; you'll see a table with your interfaces' MAC addresses and a Transport Name column with entries like this: \Device\Tcpip_{16000000-0000-0000-0000-145C4638064C}
  • run dnsmonster.exe in cmd.exe like this:
dnsmonster.exe --devName \Device\NPF_{16000000-0000-0000-0000-145C4638064C}

Note that you must change \Tcpip from getmac.exe to \NPF and then pass it to dnsmonster.exe.

FreeBSD and macOS

Much the same as on Linux and Windows: make sure you have git, libpcap and go installed, then follow the same instructions:

git clone https://github.com/mosajjal/dnsmonster --depth 1 /tmp/dnsmonster 
cd /tmp/dnsmonster
go get
go build -o dnsmonster .

Architecture

AIO Installation using Docker

Basic AIO Diagram

In the example diagram, the egress/ingress traffic of the DNS servers is captured; an optional layer of packet aggregation is then added before traffic hits the DNSMonster server. The outbound data leaving the DNS servers is quite useful for cache and performance analysis of the DNS fleet. If an aggregator is not available to you, you can connect both TAPs directly to DNSMonster and have two DNSMonster agents looking at the traffic.

Running ./autobuild.sh creates multiple containers:

  • multiple instances of dnsmonster looking at the traffic on the selected interfaces. The interface list is prompted for as part of autobuild.sh
  • an instance of clickhouse to collect dnsmonster's output, saving all the logs/data to a data and a logs directory. Both paths are prompted for as part of autobuild.sh
  • an instance of grafana with a pre-built dashboard looking at the clickhouse data.

AIO Demo

AIO Demo

Enterprise Deployment

Basic AIO Diagram

Configuration

DNSMonster can be configured using three different methods: command line options, environment variables and a configuration file. Order of precedence:

  • Command line options (Case-sensitive, camelCase)
  • Environment variables (Always upper-case)
  • Configuration file (Case-sensitive, PascalCase)
  • Default values (No configuration)
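
For example, because command line options outrank environment variables, the flag wins in the following sketch:

# DNSMONSTER_PORT is set to 5353, but --port=53 takes precedence
export DNSMONSTER_DEVNAME=lo
export DNSMONSTER_PORT=5353
sudo -E dnsmonster --port=53 --stdoutOutputType=1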

Command line options

  dnsmonster

general:
      --config=                          path to config file
                                         [$DNSMONSTER_CONFIG]
      --gcTime=                          Garbage Collection interval for tcp
                                         assembly and ip defragmentation
                                         (default: 10s) [$DNSMONSTER_GCTIME]
      --captureStatsDelay=               Duration to calculate interface stats
                                         (default: 1s)
                                         [$DNSMONSTER_CAPTURESTATSDELAY]
      --printStatsDelay=                 Duration to print capture and database
                                         stats (default: 10s)
                                         [$DNSMONSTER_PRINTSTATSDELAY]
      --maskSize4=                       Mask IPv4s by bits. 32 means all the
                                         bits of IP is saved in DB (default:
                                         32) [$DNSMONSTER_MASKSIZE4]
      --maskSize6=                       Mask IPv6s by bits. 128 means all the
                                         bits of IP is saved in DB (default:
                                         128) [$DNSMONSTER_MASKSIZE6]
      --serverName=                      Name of the server used to index the
                                         metrics. (default: default)
                                         [$DNSMONSTER_SERVERNAME]
      --tcpAssemblyChannelSize=          Size of the tcp assembler (default:
                                         10000)
                                         [$DNSMONSTER_TCPASSEMBLYCHANNELSIZE]
      --tcpResultChannelSize=            Size of the tcp result channel
                                         (default: 10000)
                                         [$DNSMONSTER_TCPRESULTCHANNELSIZE]
      --tcpHandlerCount=                 Number of routines used to handle tcp
                                         assembly (default: 1)
                                         [$DNSMONSTER_TCPHANDLERCOUNT]
      --resultChannelSize=               Size of the result processor channel
                                         size (default: 100000)
                                         [$DNSMONSTER_RESULTCHANNELSIZE]
      --logLevel=[0|1|2|3|4]             Set debug Log level, 0:PANIC, 1:ERROR,
                                         2:WARN, 3:INFO, 4:DEBUG (default: 3)
                                         [$DNSMONSTER_LOGLEVEL]
      --defraggerChannelSize=            Size of the channel to send packets to
                                         be defragged (default: 10000)
                                         [$DNSMONSTER_DEFRAGGERCHANNELSIZE]
      --defraggerChannelReturnSize=      Size of the channel where the
                                         defragged packets are returned
                                         (default: 10000)
                                         [$DNSMONSTER_DEFRAGGERCHANNELRETURNSIZE]
      --cpuprofile=                      write cpu profile to file
                                         [$DNSMONSTER_CPUPROFILE]
      --memprofile=                      write memory profile to file
                                         [$DNSMONSTER_MEMPROFILE]
      --gomaxprocs=                      GOMAXPROCS variable (default: -1)
                                         [$DNSMONSTER_GOMAXPROCS]
      --packetLimit=                     Limit of packets logged to clickhouse
                                         every iteration. Default 0 (disabled)
                                         (default: 0) [$DNSMONSTER_PACKETLIMIT]
      --skipDomainsFile=                 Skip outputting domains matching items
                                         in the CSV file path. Can accept a URL
                                         (http:// or https://) or path
                                         [$DNSMONSTER_SKIPDOMAINSFILE]
      --skipDomainsRefreshInterval=      Hot-Reload skipDomainsFile interval
                                         (default: 60s)
                                         [$DNSMONSTER_SKIPDOMAINSREFRESHINTERVAL]
      --skipDomainsFileType=             skipDomainsFile type. Options: csv and
                                         hashtable. Hashtable is ONLY fqdn, csv
                                         can support fqdn, prefix and suffix
                                         logic but it's much slower (default:
                                         csv) [$DNSMONSTER_SKIPDOMAINSFILETYPE]
      --allowDomainsFile=                Allow Domains logic input file. Can
                                         accept a URL (http:// or https://) or
                                         path [$DNSMONSTER_ALLOWDOMAINSFILE]
      --allowDomainsRefreshInterval=     Hot-Reload allowDomainsFile interval
                                         (default: 60s)
                                         [$DNSMONSTER_ALLOWDOMAINSREFRESHINTERVAL]
      --allowDomainsFileType=            allowDomainsFile type. Options: csv
                                         and hashtable. Hashtable is ONLY fqdn,
                                         csv can support fqdn, prefix and
                                         suffix logic but it's much slower
                                         (default: csv)
                                         [$DNSMONSTER_ALLOWDOMAINSFILETYPE]
      --skipTLSVerification              Skip TLS verification when making
                                         HTTPS connections
                                         [$DNSMONSTER_SKIPTLSVERIFICATION]
      --version                          show version and quit.
                                         [$DNSMONSTER_VERSION]

help:
  -h, --help                             Print this help to stdout
      --manPage                          Print Manpage for dnsmonster to stdout
      --bashCompletion                   Print bash completion script to stdout
      --fishCompletion                   Print fish completion script to stdout
      --writeConfig=                     generate a config file based on
                                         current inputs (flags, input config
                                         file and environment variables) and
                                         write to provided path

capture:
      --devName=                         Device used to capture
                                         [$DNSMONSTER_DEVNAME]
      --pcapFile=                        Pcap filename to run
                                         [$DNSMONSTER_PCAPFILE]
      --dnstapSocket=                    dnstap socket path. Example:
                                         unix:///tmp/dnstap.sock,
                                         tcp://127.0.0.1:8080
                                         [$DNSMONSTER_DNSTAPSOCKET]
      --port=                            Port selected to filter packets
                                         (default: 53) [$DNSMONSTER_PORT]
      --sampleRatio=                     Capture Sampling by a:b. eg
                                         sampleRatio of 1:100 will process 1
                                         percent of the incoming packets
                                         (default: 1:1)
                                         [$DNSMONSTER_SAMPLERATIO]
      --dnstapPermission=                Set the dnstap socket permission, only
                                         applicable when unix:// is used
                                         (default: 755)
                                         [$DNSMONSTER_DNSTAPPERMISSION]
      --packetHandlerCount=              Number of routines used to handle
                                         received packets (default: 2)
                                         [$DNSMONSTER_PACKETHANDLERCOUNT]
      --packetChannelSize=               Size of the packet handler channel
                                         (default: 1000)
                                         [$DNSMONSTER_PACKETCHANNELSIZE]
      --afpacketBuffersizeMb=            Afpacket Buffersize in MB (default:
                                         64) [$DNSMONSTER_AFPACKETBUFFERSIZEMB]
      --filter=                          BPF filter applied to the packet
                                         stream. If port is selected, the
                                         packets will not be defragged.
                                         (default: ((ip and (ip[9] == 6 or
                                         ip[9] == 17)) or (ip6 and (ip6[6] ==
                                         17 or ip6[6] == 6 or ip6[6] == 44))))
                                         [$DNSMONSTER_FILTER]
      --useAfpacket                      Use AFPacket for live captures.
                                         Supported on Linux 3.0+ only
                                         [$DNSMONSTER_USEAFPACKET]
      --noEtherframe                     The PCAP capture does not contain
                                         ethernet frames
                                         [$DNSMONSTER_NOETHERFRAME]

output:
      --clickhouseAddress=               Address of the clickhouse database to
                                         save the results (default:
                                         localhost:9000)
                                         [$DNSMONSTER_CLICKHOUSEADDRESS]
      --clickhouseDelay=                 Interval between sending results to
                                         ClickHouse (default: 1s)
                                         [$DNSMONSTER_CLICKHOUSEDELAY]
      --clickhouseDebug                  Debug Clickhouse connection
                                         [$DNSMONSTER_CLICKHOUSEDEBUG]
      --clickhouseSaveFullQuery          Save full packet query and response in
                                         JSON format.
                                         [$DNSMONSTER_CLICKHOUSESAVEFULLQUERY]
      --clickhouseOutputType=[0|1|2|3|4] What should be written to clickhouse.
                                         options:
                                         ;	0: Disable Output
                                         ;	1: Enable Output without any filters
                                         ;	2: Enable Output and apply
                                         skipdomains logic
                                         ;	3: Enable Output and apply
                                         allowdomains logic
                                         ;	4: Enable Output and apply both skip
                                         and allow domains logic (default: 0)
                                         [$DNSMONSTER_CLICKHOUSEOUTPUTTYPE]
      --clickhouseBatchSize=             Minimum capacity of the cache array
                                         used to send data to clickhouse. Set
                                         close to the queries per second
                                         received to prevent allocations
                                         (default: 100000)
                                         [$DNSMONSTER_CLICKHOUSEBATCHSIZE]
      --clickhouseWorkers=               Number of Clickhouse output Workers
                                         (default: 1)
                                         [$DNSMONSTER_CLICKHOUSEWORKERS]
      --clickhouseWorkerChannelSize=     Channel Size for each Clickhouse
                                         Worker (default: 100000)
                                         [$DNSMONSTER_CLICKHOUSEWORKERCHANNELSIZE]
      --fileOutputType=[0|1|2|3|4]       What should be written to file.
                                         options:
                                         ;	0: Disable Output
                                         ;	1: Enable Output without any filters
                                         ;	2: Enable Output and apply
                                         skipdomains logic
                                         ;	3: Enable Output and apply
                                         allowdomains logic
                                         ;	4: Enable Output and apply both skip
                                         and allow domains logic (default: 0)
                                         [$DNSMONSTER_FILEOUTPUTTYPE]
      --fileOutputPath=                  Path to output file. Used if
                                         fileOutputType is not none
                                         [$DNSMONSTER_FILEOUTPUTPATH]
      --fileOutputFormat=[json|csv]      Output format for file.
                                         options:json,csv. note that the csv
                                         splits the datetime format into
                                         multiple fields (default: json)
                                         [$DNSMONSTER_FILEOUTPUTFORMAT]
      --stdoutOutputType=[0|1|2|3|4]     What should be written to stdout.
                                         options:
                                         ;	0: Disable Output
                                         ;	1: Enable Output without any filters
                                         ;	2: Enable Output and apply
                                         skipdomains logic
                                         ;	3: Enable Output and apply
                                         allowdomains logic
                                         ;	4: Enable Output and apply both skip
                                         and allow domains logic (default: 0)
                                         [$DNSMONSTER_STDOUTOUTPUTTYPE]
      --stdoutOutputFormat=[json|csv]    Output format for stdout.
                                         options:json,csv. note that the csv
                                         splits the datetime format into
                                         multiple fields (default: json)
                                         [$DNSMONSTER_STDOUTOUTPUTFORMAT]
      --syslogOutputType=[0|1|2|3|4]     What should be written to Syslog
                                         server. options:
                                         ;	0: Disable Output
                                         ;	1: Enable Output without any filters
                                         ;	2: Enable Output and apply
                                         skipdomains logic
                                         ;	3: Enable Output and apply
                                         allowdomains logic
                                         ;	4: Enable Output and apply both skip
                                         and allow domains logic (default: 0)
                                         [$DNSMONSTER_SYSLOGOUTPUTTYPE]
      --syslogOutputEndpoint=            Syslog endpoint address, example:
                                         udp://127.0.0.1:514,
                                         tcp://127.0.0.1:514. Used if
                                         syslogOutputType is not none
                                         [$DNSMONSTER_SYSLOGOUTPUTENDPOINT]
      --kafkaOutputType=[0|1|2|3|4]      What should be written to kafka.
                                         options:
                                         ;	0: Disable Output
                                         ;	1: Enable Output without any filters
                                         ;	2: Enable Output and apply
                                         skipdomains logic
                                         ;	3: Enable Output and apply
                                         allowdomains logic
                                         ;	4: Enable Output and apply both skip
                                         and allow domains logic (default: 0)
                                         [$DNSMONSTER_KAFKAOUTPUTTYPE]
      --kafkaOutputBroker=               kafka broker address, example:
                                         127.0.0.1:9092. Used if
                                         kafkaOutputType is not none
                                         [$DNSMONSTER_KAFKAOUTPUTBROKER]
      --kafkaOutputTopic=                Kafka topic for logging (default:
                                         dnsmonster)
                                         [$DNSMONSTER_KAFKAOUTPUTTOPIC]
      --kafkaBatchSize=                  Minimum capacity of the cache array
                                         used to send data to Kafka (default:
                                         1000) [$DNSMONSTER_KAFKABATCHSIZE]
      --kafkaBatchDelay=                 Interval between sending results to
                                         Kafka if Batch size is not filled
                                         (default: 1s)
                                         [$DNSMONSTER_KAFKABATCHDELAY]
      --elasticOutputType=[0|1|2|3|4]    What should be written to elastic.
                                         options:
                                         ;	0: Disable Output
                                         ;	1: Enable Output without any filters
                                         ;	2: Enable Output and apply
                                         skipdomains logic
                                         ;	3: Enable Output and apply
                                         allowdomains logic
                                         ;	4: Enable Output and apply both skip
                                         and allow domains logic (default: 0)
                                         [$DNSMONSTER_ELASTICOUTPUTTYPE]
      --elasticOutputEndpoint=           elastic endpoint address, example:
                                         http://127.0.0.1:9200. Used if
                                         elasticOutputType is not none
                                         [$DNSMONSTER_ELASTICOUTPUTENDPOINT]
      --elasticOutputIndex=              elastic index (default: default)
                                         [$DNSMONSTER_ELASTICOUTPUTINDEX]
      --elasticBatchSize=                Send data to Elastic in batch sizes
                                         (default: 1000)
                                         [$DNSMONSTER_ELASTICBATCHSIZE]
      --elasticBatchDelay=               Interval between sending results to
                                         Elastic if Batch size is not filled
                                         (default: 1s)
                                         [$DNSMONSTER_ELASTICBATCHDELAY]
      --splunkOutputType=[0|1|2|3|4]     What should be written to HEC. options:
                                         ;	0: Disable Output
                                         ;	1: Enable Output without any filters
                                         ;	2: Enable Output and apply
                                         skipdomains logic
                                         ;	3: Enable Output and apply
                                         allowdomains logic
                                         ;	4: Enable Output and apply both skip
                                         and allow domains logic (default: 0)
                                         [$DNSMONSTER_SPLUNKOUTPUTTYPE]
      --splunkOutputEndpoints=           splunk endpoint address, example:
                                         http://127.0.0.1:8088. Used if
                                         splunkOutputType is not none
                                         [$DNSMONSTER_SPLUNKOUTPUTENDPOINTS]
      --splunkOutputToken=               Splunk HEC Token (default:
                                         00000000-0000-0000-0000-000000000000)
                                         [$DNSMONSTER_SPLUNKOUTPUTTOKEN]
      --splunkOutputIndex=               Splunk Output Index (default: temp)
                                         [$DNSMONSTER_SPLUNKOUTPUTINDEX]
      --splunkOutputSource=              Splunk Output Source (default:
                                         dnsmonster)
                                         [$DNSMONSTER_SPLUNKOUTPUTSOURCE]
      --splunkOutputSourceType=          Splunk Output Sourcetype (default:
                                         json)
                                         [$DNSMONSTER_SPLUNKOUTPUTSOURCETYPE]
      --splunkBatchSize=                 Send data to HEC in batch sizes
                                         (default: 1000)
                                         [$DNSMONSTER_SPLUNKBATCHSIZE]
      --splunkBatchDelay=                Interval between sending results to
                                         HEC if Batch size is not filled
                                         (default: 1s)
                                         [$DNSMONSTER_SPLUNKBATCHDELAY]

Environment variables

All the flags can also be set via environment variables. Keep in mind that the name of each variable is always all upper case, prefixed with "DNSMONSTER_". Example:

$ export DNSMONSTER_PORT=53
$ export DNSMONSTER_DEVNAME=lo
$ sudo -E dnsmonster

Configuration file

You can run dnsmonster with a configuration file using the following command:

$ sudo dnsmonster --config=dnsmonster.ini

# Or you can use environment variables to set the configuration file path
$ export DNSMONSTER_CONFIG=dnsmonster.ini
$ sudo -E dnsmonster
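
If you'd rather not write the file by hand, the --writeConfig flag shown in the help section above can bootstrap one from your current flags and environment variables. A sketch:

# generate dnsmonster.ini from the current inputs, then reuse it on later runs
sudo dnsmonster --devName=lo --stdoutOutputType=1 --writeConfig=dnsmonster.ini
sudo dnsmonster --config=dnsmonster.ini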

What's the retention policy?

The default retention policy for the ClickHouse tables is set to 30 days. You can change this number when building the containers with ./autobuild.sh. Since the TTL expression is evaluated against the packet date coming from the pcap files rather than an insertion timestamp, ClickHouse may automatically start removing old data as it is being written when you import old pcap files, and you won't see any actual data in Grafana. To fix that, change the TTL to a day older than the earliest packet inside your pcap file.

NOTE: to change the TTL at any point in time, you need to connect directly to the ClickHouse server using a clickhouse client and run the following SQL statement (this example changes it from 30 to 90 days):

ALTER TABLE DNS_LOG MODIFY TTL DnsDate + INTERVAL 90 DAY;

NOTE: The above command only changes TTL for the raw DNS log data, which is the majority of your capacity consumption. To make sure that you adjust the TTL for every single aggregation table, you can run the following:

ALTER TABLE DNS_LOG MODIFY TTL DnsDate + INTERVAL 90 DAY;
ALTER TABLE `.inner.DNS_DOMAIN_COUNT` MODIFY TTL DnsDate + INTERVAL 90 DAY;
ALTER TABLE `.inner.DNS_DOMAIN_UNIQUE` MODIFY TTL DnsDate + INTERVAL 90 DAY;
ALTER TABLE `.inner.DNS_PROTOCOL` MODIFY TTL DnsDate + INTERVAL 90 DAY;
ALTER TABLE `.inner.DNS_GENERAL_AGGREGATIONS` MODIFY TTL DnsDate + INTERVAL 90 DAY;
ALTER TABLE `.inner.DNS_EDNS` MODIFY TTL DnsDate + INTERVAL 90 DAY;
ALTER TABLE `.inner.DNS_OPCODE` MODIFY TTL DnsDate + INTERVAL 90 DAY;
ALTER TABLE `.inner.DNS_TYPE` MODIFY TTL DnsDate + INTERVAL 90 DAY;
ALTER TABLE `.inner.DNS_CLASS` MODIFY TTL DnsDate + INTERVAL 90 DAY;
ALTER TABLE `.inner.DNS_RESPONSECODE` MODIFY TTL DnsDate + INTERVAL 90 DAY;
ALTER TABLE `.inner.DNS_SRCIP_MASK` MODIFY TTL DnsDate + INTERVAL 90 DAY;

UPDATE: in recent versions of ClickHouse, the .inner tables do not have the same names as the corresponding aggregation views. To modify the TTL you have to find the table names in UUID format using SHOW TABLES and repeat the ALTER command with those UUIDs.
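
A hedged sketch of that workflow (the UUID below is a placeholder; use the inner table names that SHOW TABLES actually returns on your server):

# list the .inner tables backing the materialized views, then alter each one
clickhouse-client --query "SHOW TABLES"
clickhouse-client --query "ALTER TABLE \`.inner_id.00000000-0000-0000-0000-000000000000\` MODIFY TTL DnsDate + INTERVAL 90 DAY"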

Sampling and Skipping

pre-process sampling

dnsmonster supports pre-processing sampling of packets using a simple parameter: sampleRatio. This parameter accepts a "ratio" value like "1:2", which means that for every 2 packets that arrive, only one is processed (50% sampling). Note that this sampling happens AFTER BPF filters, not before. If you have trouble keeping up with the volume of your DNS traffic, you can set this to something like "2:10", meaning 20% of the packets that pass your BPF filter will be processed by dnsmonster.
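
A sketch (the interface name is an assumption):

# process 2 out of every 10 packets that pass the BPF filter (20% sampling)
sudo dnsmonster --devName=eth0 --sampleRatio=2:10 --stdoutOutputType=1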

skip domains

dnsmonster supports a post-processing domain skip list to avoid writing noisy, repetitive data to your database. The domain skip list is a CSV-formatted file with only two columns: a string and the matching logic for that string. dnsmonster supports three logics: prefix, suffix and fqdn. prefix and suffix mean that domains starting/ending with the given string will not be written to the DB, while fqdn requires a full match, useful for highly noisy FQDNs. Note that since we're talking about DNS questions, your string will most likely need a trailing . in your skip list row as well (take a look at skipdomains.csv.sample for a better view).
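
A minimal sketch; the rows are made-up examples of the string,logic format described above:

# build a small skip list, then apply it to stdout (output type 2 = skipdomains logic)
cat > skipdomains.csv <<'EOF'
www.google.com.,fqdn
in-addr.arpa.,suffix
imds.,prefix
EOF
sudo dnsmonster --devName=eth0 --skipDomainsFile=skipdomains.csv --stdoutOutputType=2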

allow domains

dnsmonster also has the concept of "allowdomains", which helps build detections when certain FQDNs, prefixes or suffixes are present in the DNS traffic. Since dnsmonster supports multiple output streams, each with its own logic, it's possible to collect all DNS traffic in ClickHouse while collecting only "allowlist" domains in stdout or in a file, within the same instance of dnsmonster.
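
A sketch of that split, assuming a local ClickHouse and an allowdomains.csv in the same string,logic format:

# everything goes to ClickHouse (output type 1); only allowlist hits go to stdout (output type 3)
sudo dnsmonster --devName=eth0 \
  --clickhouseAddress=localhost:9000 --clickhouseOutputType=1 \
  --allowDomainsFile=allowdomains.csv --stdoutOutputType=3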

SAMPLE in clickhouse SELECT queries

By default, the main table created by the tables.sql file (DNS_LOG) has the ability to sample down results as needed, since each DNS question has a semi-unique UUID associated with it. For more information about SAMPLE queries in ClickHouse, please check out this document.
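
For example, a hedged sketch of a sampled top-questions query (the column names assume the DNS_LOG schema from tables.sql):

# read roughly 10% of rows to estimate the busiest questions
clickhouse-client --query "SELECT Question, count(*) AS c FROM DNS_LOG SAMPLE 0.1 GROUP BY Question ORDER BY c DESC LIMIT 10"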

Supported Outputs

  • Clickhouse
  • Kafka
  • Elasticsearch
  • Splunk HEC
  • Stdout
  • File
  • Syslog (Linux Only)

Roadmap

  • Down-sampling capability for SELECT queries
  • Adding afpacket support
  • Configuration file option
  • Exclude FQDNs from being indexed
  • FQDN whitelisting to only log certain domains
  • dnstap support
  • Kafka output support
  • Ability to load allowDomains and skipDomains from HTTP(S) endpoints
  • Elasticsearch output support
  • Splunk HEC output support
  • Syslog output support
  • Grafana dashboard performance improvements
  • remove libpcap dependency and move to pcapgo for packet processing
  • Splunk Dashboard
  • Kibana Dashboard
  • Optional SSL for Clickhouse
  • De-duplication support
  • Getting the data ready to be used for ML & Anomaly Detection
  • Clickhouse versioning and migration tool
  • statsd and Prometheus support

Related projects

Comments
  • "IPv6 Packet Destination Top 20 Prefix" -- IPv6 addresses are incorrect

    In "IPv6 Packet Destination Top 20 Prefix" panel, the IPv6 addresses are incorrect.

    e.g. in my panel, it is showing random IPs.

    In csv.go, the code for converting the IPv6 address to decimal is:

    SrcIP = binary.BigEndian.Uint64(d.SrcIP[:8]) // limitation of clickhouse-go doesn't let us go more than 64 bits for ipv6 at the moment
    DstIP = binary.BigEndian.Uint64(d.DstIP[:8])

    As per my understanding of the above comment, clickhouse does not allow more than 64 bits for a variable. Is there a way to show the correct data on the panel?

  • Question around pcap file behaviour

    Thanks for writing a really useful tool for pcap parsing into dns json!

    I just have a question, I'm running the command as such below:

    $ dnsmonster --pcapfile="output.pcap" --fileoutputpath=dns.json --fileoutputformat=json --fileoutputtype=1
    INFO[2022-11-04T15:44:41Z] Creating the dispatch Channel
    INFO[2022-11-04T15:44:41Z] Creating File Output Channel
    INFO[2022-11-04T15:44:41Z] Using File: output.pcap
    
    INFO[2022-11-04T15:44:41Z] skipping skipDomains refresh since it's not provided
    INFO[2022-11-04T15:44:41Z] skipping allowDomains refresh since it's not provided
    WARN[2022-11-04T15:44:41Z] BPF Filter is not supported in offline mode.
    INFO[2022-11-04T15:44:41Z] Reading off Pcap file
    INFO[2022-11-04T15:44:41Z] Creating handler #0
    INFO[2022-11-04T15:44:41Z] Creating handler #1
    2022-11-04T15:44:51Z metrics: {"fileSentToOutput":{"count":136405},"fileSkipped":{"count":0},"packetLossPercent":{"value":0},"packetsCaptured":{"value":0},"packetsDropped":{"value":0},"packetsDuplicate":{"count":0},"packetsOverRatio":{"count":0}}
    2022-11-04T15:45:01Z metrics: {"fileSentToOutput":{"count":136405},"fileSkipped":{"count":0},"packetLossPercent":{"value":0},"packetsCaptured":{"value":0},"packetsDropped":{"value":0},"packetsDuplicate":{"count":0},"packetsOverRatio":{"count":0}}
    ...
    

    But this never ends? According to top/iotop the process has finished, and I can confirm the output json file seems to have stopped growing, but dnsmonster never returns to the shell.

    Is this expected behaviour?

  • dnsmonster not sending all packets from pcap to clickhouse

    I noticed that the number of DNS packets stored in my local clickhouse instance was always a multiple of the clickhousebatchsize: when using dnsmonster to process pcap files with clickhousebatchsize set to non-zero, the clickhouse output does not send all results to clickhouse.

    Scenario:

    • clickhousebatchsize = 100000 (default)
    • output to clickhouse
    • pcaps with number of DNS packets which is not multiple of the batch size, e.g. 1234567

    Result: dnsmonster sends info for only 1200000 packets to clickhouse, in batches of 100000, missing the remaining 34567 packets because the batch size was never reached.

    The code section responsible is https://github.com/mosajjal/dnsmonster/blob/928d15f5243b37e9dae6a7f4b8ffab77bb65415a/output/clickhouse.go#L222-L231

    I did not see anything that makes sure the remainder is being sent to clickhouse when the end of the pcap file has been reached.

  • Splunk Output mode >1 broken, 8.4.0

    Hello,

    It seems the filtering/allow logic is broken for the Splunk HEC output: once splunkOutputType is increased above 1, all domains get skipped, no matter what is in (or not in) the allow/skip files:

    {"level":"info","msg":"output: {Name:splunk SentToOutput:0 Skipped:99441}","time":"2021-07-13T15:03:29+10:00"}
    {"level":"info","msg":"{PacketsGot:99485 PacketsLost:0 PacketLossPercent:0}","time":"2021-07-13T15:03:29+10:00"}

    Config file:

    useAfpacket=true
    devName=myerspan
    splunkOutputType=3
    skipDomainsFile=/app/dnsmonster/filterDomains.csv
    splunkOutputEndpoint=:8088
    splunkOutputToken=
    skipTlsVerification=true
    splunkOutputIndex=
    splunkOutputSource=
    splunkOutputSourceType=

    filterDomains: empty

    Are you able to please advise if something is wrong with the config or if this bug has been fixed in commits past the 8.4.0 release?

    Thanks, Lachlan

  • Prevent packetsCaptured counter from being filled with misleading values

    Overview

    For non-dnstap handles, the implementation of Stat, in addition to how its results are treated, can be confusing and doesn't allow for flexibility. There is an implicit contract between the caller of Stat and certain handles where a return value of 0 packets captured is treated the same as 1 packet being captured. This can be evidenced by the following code, in addition to certain comments within the codebase:

    if packets == 0 { // to make up for pcap not being able to get stats
        packetsCaptured.Update(totalCnt)
    }
    
    // in printstats, we check if this is 0, and we add the total counter to this to make sure we have a better number
    return 0, 0
    

    In my case, I am using dnsmonster with a custom handle that reads samples as perf events from an eBPF map. The Stat function is therefore susceptible to returning 0 when nothing new has been written to the event map. When this happens, the packetsCaptured counter is overwritten with the total number of times the application has tried polling for a packet.

    The proposed changes do the following:

    • A third return value of type error is added to the signature of Stat such that we can clearly distinguish when there has been an error. This is preferred to simply returning 0 packets captured with no additional context or indication that there was an error, which makes it impossible to differentiate when a handle has simply read 0 packets because there was nothing to read versus when an error prevented it from performing the read.
    • Because of the aforementioned change, packetsCaptured can now be 0. We have to avoid the possibility of division by 0, so a guard is added around the packetLossPercent update call.
    • For handles where the underlying implementation doesn't let you pull the stats (i.e. pcapFileHandle), we keep track of the number ourselves. This serves a similar purpose to totalCnt, in that we can still keep metrics about handles that either can't tell us how many packets were captured because of an error or don't provide a way of pulling the information directly from the source.

    Testing

    This problem was noticed as follows:

    • Packet handler with Stat method returning 0, 0 for a continuous period of time under normal circumstances
    • Exponential increase in packetsCaptured despite no packets being read
    2022-09-15T16:18:59Z metrics: {"packetLossPercent":{"value":0},"packetsCaptured":{"value":5353688},"packetsDropped":{"value":0},"packetsDuplicate":{"count":0},"packetsOverRatio":{"count":0},"stdoutSentToOutput":{"count":0},"stdoutSkipped":{"count":0}}
    2022-09-15T16:19:09Z metrics: {"packetLossPercent":{"value":0},"packetsCaptured":{"value":11124805},"packetsDropped":{"value":0},"packetsDuplicate":{"count":0},"packetsOverRatio":{"count":0},"stdoutSentToOutput":{"count":0},"stdoutSkipped":{"count":0}}
    2022-09-15T16:19:19Z metrics: {"packetLossPercent":{"value":0},"packetsCaptured":{"value":16830117},"packetsDropped":{"value":0},"packetsDuplicate":{"count":0},"packetsOverRatio":{"count":0},"stdoutSentToOutput":{"count":0},"stdoutSkipped":{"count":0}}
    

    One can observe that within the span of 10 seconds, packetsCaptured is continuously increasing by over 5M. If we were to print totalCnt on each loop iteration, we would notice that this is because the counter is being set to this value instead of remaining at 0.

    While other handles were not tested (e.g. livepcap), these changes should affect all of them in the same way. Furthermore, these modifications assume that, in the event of an error being thrown in Stat, it is preferable to output a warning message and avoid touching the packetsCaptured counter.

  • make use of kafka lib features

    This change makes it so we take advantage of kafka-go's features such as:

    • ability to spread the messages across topic partitions
    • compression
    • internal batching
    • retries and reconnections
    • asynchronous

    It also makes the code simpler and implements the close functionality of the output.

  • Clickhouse datasource plugin showing as unsigned in grafana

    First of all I would like to say a big thank you for making this.

    Everything gets installed fine; I can even see the logs in the clickhouse container when checking with clickhouse-client, so that works brilliantly. However, the local ClickHouse datasource doesn't load in Grafana. I tried adding a new one and noticed the following error at the top.

    I tried changing the "allow_loading_unsigned_plugins" flag too, but the error is not going away, and ClickHouse is not available in the list of datasources.

  • Initial commit of Microsoft ASIM DNS Parser

    Not sure if this is something that you want merged into the main branch. Seeing that you have Sentinel output support, I wrote a set of ASIM DNS parsers for the logs that DNSMonster imports into Microsoft Sentinel. This allows the log entries to be used with the Microsoft normalization analytics rules in Sentinel. More information is in the README. I don't have a greenfield Sentinel instance to test all the installation/creation documentation against again, but I believe I documented it properly.

  • dnsmonster not sending packets from live interface to clickhouse

    Hi !!

    Hope you are doing well. I downloaded the latest dnsmonster (v0.9.5) binary from the release section, but dnsmonster is not sending packets from a live interface to clickhouse. The error is given below:

    [4515]: time="2022-10-07T18:11:53+05:30" level=warning msg="Error while executing batch: clickhouse [Append]: clickhouse: expected 18 arguments, got 17"

    The previous version binaries (v0.9.2 & v0.9.3) are working fine.

    Kindly check and do the needful.

    Thanks !!!!

  • Fix non-dnstap metrics

    Overview

    Fixes packetsCaptured always being overwritten for non-dnstap captures, since the value of drop is being added to the wrong counter.

    Testing

    This is most apparent when we call Stat() and get a value of 0 for the number of packets dropped, causing the line packetLossPercent.Update(float64(packetsDropped.Value()) * 100.0 / float64(packetsCaptured.Value())) to attempt a division by 0, which results in packetLossPercent.Value() being NaN. During serialization of metrics (e.g. to JSON), this can prevent them from being outputted entirely if the serializer doesn't know how to interpret NaN.

  • dnsmonster to clickhouse data replication

    First of all, thank you for this amazing dnsmonster! We have a cluster of three clickhouse nodes and our DNS data goes through dnsmonster, but only one clickhouse node's IP address can be mentioned in the dnsmonster configuration. The problem is: if the mentioned clickhouse node goes down, will our data replicate to the other cluster nodes?

  • higher precision timestamps in clickhouse

    It would be nice to have higher precision timestamps for the packet time in clickhouse.

    The current DNS_Log table: https://github.com/mosajjal/dnsmonster/blob/7ebb729d4655cd4f9ba85c800724f3c3313d049b/clickhouse/tables.sql#L1-L3

    I'm not sure why IndexTime is higher precision, but PacketTime is just seconds.

    The PostgreSQL output uses timestamp for both https://github.com/mosajjal/dnsmonster/blob/928d15f5243b37e9dae6a7f4b8ffab77bb65415a/output/postgres.go#L80 which is already higher precision, see https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-DATETIME-INPUT

    Maybe the PacketTime column in clickhouse's DNS_LOG could be changed to the higher-resolution https://clickhouse.com/docs/en/sql-reference/data-types/datetime64 as well?

    Just changing the data type for the column seems to work.

  • ClickHouse over network

    I made a small change that leverages the builtin driver's support. Most likely I'll change the way the Address is provided so we can have different credentials and TLS support for each address later on. Please let me know if the latest commit solves your issue. Happy to re-open if it doesn't.

    Originally posted by @mosajjal in https://github.com/mosajjal/dnsmonster/issues/27#issuecomment-1144723538

  • Capdns: a network capture utility designed specifically for DNS traffic, based on tcpdump.
  • Subfinder: a subdomain discovery tool that finds valid subdomains for websites using passive online sources; designed as a passive framework useful for bug bounties and safe for penetration testing.
  • Thola: a tool for monitoring network devices (mainly using SNMP), written in Go, with a check mode that complies with monitoring-plugin conventions.
  • DNSping: checks packet loss and latency issues with DNS servers.
  • goodbots: verifies the IP addresses of respectful crawlers like Googlebot by performing reverse and forward DNS lookups.
  • DsDDNS: the Dual-Stack Dynamic DNS client, the world's first dynamic DNS client built for IPv6; keeps your DNS records in sync with your IP addresses.
  • dns.providers.netcup: a netcup DNS module for Caddy, used to manage DNS records via the netcup DNS API.
  • A fork of miekg/dns: an alternative, more granular approach to a DNS library, with all Resource Records supported.
  • A simple DNS forwarder that forwards DNS queries to various upstreams; if an upstream returns NXDomain, the next upstream is tried.
  • network-fingerprint: captures packet request/response pairs for a port and/or IP to aid in creating network-protocol-based Nuclei Templates.
  • dumpr!: a tool to capture text-based TCP traffic from the receiver's point of view.
  • Proxify: a Swiss Army knife proxy tool for HTTP/HTTPS traffic capture, manipulation, and replay on the go.
  • httpcap: a simple network analyzer that captures HTTP network traffic (Windows/macOS/Linux/OpenWrt x64).
  • sensor-probe: reads advertisement data from the Xiaomi Thermometer LYWSD03MMC via Bluetooth LE and exposes it as Prometheus metrics.
  • hass-shooter: a Home Assistant screenshot capture web server suitable for e-ink displays.
  • Squzy: a high-performance open-source monitoring, incident and alerting system written in Go.
  • Thanos Federate Proxy: a proxy to convert /federate queries to /v1/api/query and respond in OpenMetrics format.
  • Basenine: a schema-free, document-oriented streaming database optimized for monitoring network traffic in real time.
  • A DNS library in Go: an alternative, more granular approach to a DNS library.