Monitor your network and internet speed with Docker & Prometheus

Internet Monitoring Docker Stack with Prometheus + Grafana

This repository is a fork from maxandersen/internet-monitoring, tailored for use on a Raspberry Pi. It has only been tested on a Raspberry Pi 4 running Pi OS 64-bit beta.

Stand up a Docker stack containing Prometheus, Grafana, blackbox-exporter, and speedtest-exporter to collect and graph home Internet reliability and throughput.

Pre-requisites

Make sure Docker and Docker Compose are installed on your Docker host machine.

Quick Start

git clone https://github.com/geerlingguy/internet-monitoring
cd internet-monitoring
docker-compose up -d

Go to http://localhost:3030/d/o9mIe_Aik/internet-connection (change localhost to your Docker host's IP address or hostname).

Configuration

To change which hosts are pinged, edit the targets section in the /prometheus/pinghosts.yaml file.

For speedtest, the only relevant setting is how often the check runs. The default is every 30 minutes, which might be too frequent if you have a data cap. To change it, edit scrape_interval under speedtest in /prometheus/prometheus.yml.
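
To judge whether a given interval fits within a data cap, a rough estimate of monthly speedtest traffic can help. The sketch below is only arithmetic. The amount of data each test transfers (mb_per_test) is an assumption that depends on your connection speed, so measure it on your own link before relying on the result.

```python
def monthly_speedtest_usage_gb(interval_minutes: float, mb_per_test: float, days: int = 30) -> float:
    """Estimate the data consumed by periodic speedtests, in gigabytes (1 GB = 1000 MB)."""
    tests_per_day = 24 * 60 / interval_minutes
    return tests_per_day * days * mb_per_test / 1000


# Hypothetical figure: assume each test transfers about 100 MB in total.
print(monthly_speedtest_usage_gb(interval_minutes=30, mb_per_test=100))  # 144.0
```

With these assumed numbers, raising the interval to 60 minutes would halve the estimate to 72 GB per month.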

Once configurations are done, run the following command:

$ docker-compose up -d

That's it. docker-compose builds the entire Grafana and Prometheus stack automagically.

The Grafana dashboard is now accessible at http://<Host IP Address>:3030, for example http://localhost:3030

Username: admin. Password: wonka (the password is stored in the config.monitoring env file).

The DataSource and Dashboard for Grafana are automatically provisioned.

If all works, the dashboard should be available at http://localhost:3030/d/o9mIe_Aik/internet-connection. If no data shows up, try changing the time range to something smaller.

Interesting URLs

http://localhost:9090/targets shows the status of monitored targets as seen from Prometheus, in this case the hosts being pinged and speedtest. Note: speedtest takes a while to show as UP because it needs about 30 seconds to respond.

http://localhost:9090/graph?g0.expr=probe_http_status_code&g0.tab=1 shows the Prometheus value of probe_http_status_code for each host. You can edit the expression and experiment with other values. This is useful for checking that everything is okay in Prometheus (in case Grafana is not showing the data you expect).
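
The same check can be scripted against the Prometheus HTTP API, which serves instant queries at /api/v1/query. The following is a minimal sketch, assuming Prometheus is reachable at http://localhost:9090 as in this stack. The helper names are illustrative and not part of this repository.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(base_url, expr):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"


def parse_instant_vector(payload):
    """Map each result's 'instance' label to its sample value as a float."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    values = {}
    for sample in payload["data"]["result"]:
        instance = sample["metric"].get("instance", "")
        _timestamp, value = sample["value"]  # the value arrives as a string
        values[instance] = float(value)
    return values


def fetch_instant_vector(base_url, expr, timeout=10):
    """Run the query against a live Prometheus server and parse the reply."""
    with urlopen(query_url(base_url, expr), timeout=timeout) as response:
        return parse_instant_vector(json.load(response))


# Example (requires the stack to be running):
#     fetch_instant_vector("http://localhost:9090", "probe_http_status_code")
```

Keeping the parsing separate from the network call makes it possible to try the parser on a saved response without a running server.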

http://localhost:9115 is the blackbox exporter endpoint. It lets you see which probes have failed or succeeded.

http://localhost:9798/metrics is the speedtest exporter endpoint. It takes about 30 seconds to show its result because it runs an actual speedtest when requested.
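
The /metrics output uses the Prometheus text exposition format, so it can also be read by a small script. The sketch below handles only simple "name value" lines and ignores comment lines; it does not handle trailing timestamps. The metric names in the sample text are assumptions for illustration and may differ from what the exporter actually emits.

```python
def parse_metrics(text):
    """Parse 'name value' lines from Prometheus text exposition output."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.rpartition(" ")
        if name:
            metrics[name] = float(value)
    return metrics


# Hypothetical sample; real metric names may differ.
sample = """\
# TYPE speedtest_download_bits_per_second gauge
speedtest_download_bits_per_second 93500000.0
speedtest_up 1.0
"""
print(parse_metrics(sample))
```

Because each request to the endpoint starts a real speedtest, any script that fetches it should allow a generous timeout.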

Thanks and a disclaimer

Thanks to @maxandersen for making the original project this fork is based on.

Thanks to @vegasbrianc for his work on making a super easy Docker stack for running Prometheus and Grafana.

This setup is not secured in any way, so please only use on non-public networks, or find a way to secure it on your own.

Owner
Jeff Geerling
Catholic dad and husband. I write, build, and tinker on a Mac. #stl #drupal #ansible #k8s #raspberrypi #crohns
Comments
  • Grafana container keeps restarting after hundreds of 'Client.Timeout exceeded while awaiting headers' errors

    Here's what I see in the logs this evening:

    Error: ✗ Failed to send request: Get "https://grafana.com/api/plugins/repo/flant-statusmap-panel": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    

    (This repeats over 3000 times.)

    A quick DDG search doesn't see anything obvious, and I can hit https://grafana.com/api/plugins/repo/flant-statusmap-panel from my browser okay. Not sure what's up, but it's causing the Grafana container to keep restarting and never fully launch the web UI.

  • Speedtest container goes unhealthy after a few hours (consistently)

    The speedtest container seems to just go unhealthy from time to time, with nothing in the logs indicating an issue. It's just like the flask app locks up and stops returning HTTP responses altogether.

    This leads to one of my favorite graphs having blank periods until I log into the Pi and manually restart the container (docker-compose restart speedtest).

    I asked about this problem upstream here: https://github.com/MiguelNdeCarvalho/speedtest-exporter/issues/48#issuecomment-816360616

    But I'm thinking of adding a simple cron job / script that checks if the container is unhealthy (maybe every 5 or 10 minutes), and if so, restarts it.
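
A watchdog of the kind described here could be sketched as follows. It asks Docker for the container's health status and restarts the container only when that status is unhealthy. The container name is a placeholder to replace with the name shown by docker ps, and the cron schedule in the comment is only an example.

```python
import subprocess


def container_health(name, run=subprocess.run):
    """Return the Docker health status, e.g. 'healthy', 'unhealthy', or 'starting'."""
    result = run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def restart_if_unhealthy(name, run=subprocess.run):
    """Restart the container if Docker reports it unhealthy; return True if restarted."""
    if container_health(name, run) != "unhealthy":
        return False
    run(["docker", "restart", name], check=True)
    return True


# Example use from a script run by cron, e.g. every 5 minutes (*/5 * * * *):
#     restart_if_unhealthy("speedtest")  # "speedtest" is a placeholder name
```

The run parameter exists only so the logic can be exercised without Docker. If the container defines no health check, the inspect call fails and raises CalledProcessError.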

  • Speedtest Choose specific server

    speedtest-cli lets you choose a server manually instead of automatically picking the best one. This is done with the -s flag and a server ID, for example speedtest -s 35945. This allows us to test speeds against a fixed server rather than letting speedtest choose, since the automatic choice might not always pick the best server and so won't provide the most accurate data.

    Adding a way to test multiple servers could also be a good idea. Since my ISP only promises speeds up to their node, I could monitor speeds to different nodes and countries to check their peering and routing. I currently try to do this with a cron job and a Python script that stores data in a SQLite DB. A Grafana and Prometheus solution might be a much better choice in this case.
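
Testing against fixed servers could be sketched by building one command per server ID. The base command below is the one visible in the exporter's error log elsewhere on this page, and -s is the flag named in this request. This is an illustration of the idea, not how the exporter itself is implemented, and any server IDs you pass in are your own choice.

```python
BASE_CMD = [
    "speedtest",
    "--format=json-pretty",
    "--progress=no",
    "--accept-license",
    "--accept-gdpr",
]


def speedtest_commands(server_ids):
    """Return one speedtest command (as an argument list) per server ID."""
    return [BASE_CMD + ["-s", str(server_id)] for server_id in server_ids]


# 35945 is the example ID from the request above.
for cmd in speedtest_commands([35945]):
    print(" ".join(cmd))
```

Each argument list could then be run in turn and its result labelled with the server ID, which would let a dashboard compare routes to different servers.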

  • Add Starlink monitoring

    Through my internet-pi project, I'd like to add in Starlink monitoring using https://github.com/danopstech/starlink_exporter (though I've run into one issue here: https://github.com/danopstech/starlink_exporter/issues/1).

  • Speedtest container initialisation error

    Description

    On startup, the speedtest container crashes (and restarts continuously) with the following error:

    speedtest_1   | Current thread 0xb6fa0390 (most recent call first):
    speedtest_1   | <no Python frame>
    speedtest_1   | Fatal Python error: init_interp_main: can't initialize time
    speedtest_1   | Python runtime state: core initialized
    speedtest_1   | PermissionError: [Errno 1] Operation not permitted
    

    Environment

    The OS on the Raspberry Pi is the Raspberry Pi OS that came installed from the factory (which already makes me think I should do a fresh, controlled install myself).

    The environment this is running in is:

    pi@raspberrypi:~ $ whoami
    pi
    pi@raspberrypi:~ $ groups
    pi adm dialout cdrom sudo audio video plugdev games users input netdev lpadmin docker gpio i2c spi
    pi@raspberrypi:~ $ uname -a
    Linux raspberrypi 5.10.17-v7l+ #1421 SMP Thu May 27 14:00:13 BST 2021 armv7l GNU/Linux
    pi@raspberrypi:~ $ cat /proc/cpuinfo | grep Model
    Model		: Raspberry Pi 4 Model B Rev 1.4
    

    Installation

    The project setup was done following your article, up to the point of docker-compose up -d (since I'm running this locally and don't need remote access).

    Fixes attempted

    • Changed the speedtest container to different versions (2.8, 3.0, and 3.2.2)
    • Ran the container with elevated privileges (which led me to a perhaps rash PR #16)

    To Do

  • Gauge Panels return N/A if left running

    Left docker containers running for over 24 hours and all three gauges started to return N/A.

    Fixed by removing the "max data points" in the query options for the panels.

    I am by no means an expert in this; I just like to play around with cool projects. Is this the correct way to resolve it?

  • speed test not working

    I am having a hard time getting the speed test to work. This is my first time working with Docker and Grafana, so I am unsure how to import the speed test portion into the dashboard. When I navigate to /targets, I can see that it is down.

    Any help? Thanks

  • Speedtest-Exporter container healthy, but data not showing in Grafana

    Hi there,

    I am not sure if I may be doing something wrong, but the speedtest-exporter container is healthy and up, but on the Grafana dashboards on :3030, I am only occasionally seeing speedtest data.

    Any clues as to what may be happening?

    Thanks. ilyesm

  • Uptime graph hides outages when zoomed out over longer periods

    Hi,

    Is it possible to mark a square in the statusmap red when downtime occurred and also keep it red in larger timespans? Like if one minute is red also make the day red?
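
One way to get that behavior is to aggregate each time bucket by its worst value instead of its average or last value, so a single failed probe turns the whole bucket red. The sketch below shows the idea on plain (timestamp, status) pairs, with 1 meaning up and 0 meaning down. In Prometheus the analogous query would likely use min_over_time on the blackbox exporter's probe_success metric, though the dashboard's actual query is not shown here.

```python
def worst_status_per_bucket(samples, bucket_seconds):
    """Group samples into fixed-size time buckets, keeping the lowest status in each."""
    worst = {}
    for timestamp, status in samples:
        bucket_start = int(timestamp // bucket_seconds) * bucket_seconds
        worst[bucket_start] = min(worst.get(bucket_start, 1), status)
    return worst


DAY = 24 * 60 * 60
# One failed probe at t=60 s marks the whole first day as down.
samples = [(0, 1), (60, 0), (120, 1), (DAY + 30, 1)]
print(worst_status_per_bucket(samples, DAY))  # {0: 0, 86400: 1}
```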

  • Speedtest container keeps restarting

    I have a 2 GB Raspberry Pi running 32-bit PiOS that I otherwise use as an ADS-B receiver. I installed this repo on it, but the speedtest container keeps restarting. This is what I see added in the container log every minute:

    Current thread 0xb6f76390 (most recent call first):
    Fatal Python error: init_interp_main: can't initialize time
    Python runtime state: core initialized
    PermissionError: [Errno 1] Operation not permitted

    What could be wrong?

  • Speedtest-Exporter always Unhealthy Status

    Logs show:

    [2021-04-15 14:16:46.303] [error] Configuration - Cannot retrieve configuration document (0)
    [2021-04-15 14:16:46.304] [error] ConfigurationError - Could not retrieve or read configuration (Configuration)
    [2021-04-15 14:16:46.304] [error] ConfigurationError - Could not retrieve or read configuration (Configuration)
    {
        "type": "log",
        "timestamp": "2021-04-15T14:16:46Z",
        "message": "Configuration - Could not retrieve or read configuration (ConfigurationError)",
        "level": "error"
    }
    ERROR:Speedtest-Exporter:Exception on /metrics [GET]
    Traceback (most recent call last):
      File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
        response = self.full_dispatch_request()
      File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
        rv = self.handle_user_exception(e)
      File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
        reraise(exc_type, exc_value, tb)
      File "/usr/local/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
        raise value
      File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
        rv = self.dispatch_request()
      File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
        return self.view_functions[rule.endpoint](**req.view_args)
      File "/app/exporter.py", line 70, in updateResults
        r_server, r_jitter, r_ping, r_download, r_upload, r_status = runTest()
      File "/app/exporter.py", line 47, in runTest
        output = subprocess.check_output(cmd)
      File "/usr/local/lib/python3.9/subprocess.py", line 424, in check_output
        return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
      File "/usr/local/lib/python3.9/subprocess.py", line 528, in run
        raise CalledProcessError(retcode, process.args,
    subprocess.CalledProcessError: Command '['speedtest', '--format=json-pretty', '--progress=no', '--accept-license', '--accept-gdpr']' returned non-zero exit status 2.

    [2021-04-15 14:38:22.019] [error] Trying to get interface information on non-initialized socket.
    [2021-04-15 14:38:27.535] [error] Configuration - Couldn't resolve host name (HostNotFoundException)
    [2021-04-15 14:38:27.535] [error] Configuration - Cannot retrieve configuration document (0)
    [2021-04-15 14:38:27.536] [error] ConfigurationError - Could not retrieve or read configuration (Configuration)
    [2021-04-15 14:38:27.536] [error] ConfigurationError - Could not retrieve or read configuration (Configuration)

    (The same JSON log entry, timestamped 2021-04-15T14:38:27Z, and the same traceback follow.)

    [2021-04-15 16:28:01.854] [error] Trying to get interface information on non-initialized socket.
    [2021-04-15 16:28:07.373] [error] Configuration - Couldn't resolve host name (HostNotFoundException)
    [2021-04-15 16:28:07.373] [error] Configuration - Cannot retrieve configuration document (0)
    [2021-04-15 16:28:07.374] [error] ConfigurationError - Could not retrieve or read configuration (Configuration)
    [2021-04-15 16:28:07.374] [error] ConfigurationError - Could not retrieve or read configuration (Configuration)

    (The same JSON log entry, timestamped 2021-04-15T16:28:07Z, and the same traceback follow.)

  • Archiving repository - moved to `internet-pi` (can still be used independently)

    I will soon be archiving this repository, after closing out remaining issues.

    I have merged it into my internet-pi repository, though most of the configuration could still be used completely independently, if you copy out the internet-monitoring directory, then manually template out the relevant files (prometheus and munin config files) from the templates directory.

    Otherwise, I recommend setting it up using the playbook included in the internet-pi repository, which sets everything up for you and gets it running on Raspberry Pi OS or any Debian-like OS.

Cloudprober is a monitoring software that makes it super-easy to monitor availability and performance of various components of your system.

Cloudprober is a monitoring software that makes it super-easy to monitor availability and performance of various components of your system. Cloudprobe

Dec 30, 2022
Hidra is a tool to monitor all of your services without making a mess.

hidra Don't lose your mind monitoring your services. Hidra lends you its head. ICMP If you want to use ICMP scenario, you should activate on your syst

Nov 8, 2022
Monitor the performance of your Ethereum 2.0 staking pool.

eth-pools-metrics Monitor the performance of your Ethereum 2.0 staking pool. Just input the withdrawal credentials that were used in the deposit contr

Dec 30, 2022
Monitor & detect crashes in your Kubernetes(K8s) cluster

kwatch kwatch helps you monitor all changes in your Kubernetes(K8s) cluster, detects crashes in your running apps in realtime, and publishes notificat

Dec 28, 2022
Gowl is a process management and process monitoring tool at once. An infinite worker pool gives you the ability to control the pool and processes and monitor their status.

Gowl is a process management and process monitoring tool at once. An infinite worker pool gives you the ability to control the pool and processes and monitor their status.

Nov 10, 2022
Scraping medium blogs to make them loadable with shitty internet and have a pleasant reading experience

Unmedium This project is still WIP We all know medium right? A bunch of JS, wast

Mar 20, 2022
SigNoz helps developer monitor applications and troubleshoot problems in their deployed applications

SigNoz helps developers monitor their applications & troubleshoot problems, an open-source alternative to DataDog, NewRelic, etc. 🔥 🖥

Dec 27, 2022
Monitor a process and trigger a notification.

noti Monitor a process and trigger a notification. Never sit and wait for some long-running process to finish. Noti can alert you when it's done. You

Jan 3, 2023
Kubernetes monitor

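Mode description: the corresponding configuration option is collect_mode, which takes one of three values: cadvisor_plugin | kubelet_agent | server_side. All modes share the same code base. Mode name, deployment and run method, collect_mode setting, description: collects the cadvisor raw API as a Nightingale plugin; the executable plugin is run by the Nightingale age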

Nov 18, 2022
MySQL Monitor Script

README.md Introduction mymon (MySQL-Monitor) is a plugin used by Open-Falcon to monitor the running state of MySQL databases. It collects global status, global variables, slave status, innodb status, and other MySQL runtime

Dec 26, 2022
Open Source Supreme Monitor Based on GoLang

Open Source Supreme Monitor Based on GoLang A module built for personal use but ended up being worthy to have it open sourced.

Nov 4, 2022
SigNoz helps developers monitor their applications & troubleshoot problems, an open-source alternative to DataDog, NewRelic, etc. 🔥 🖥. 👉 Open source Application Performance Monitoring (APM) & Observability tool

Monitor your applications and troubleshoot problems in your deployed applications, an open-source alternative to DataDog, New Relic, etc. Documentatio

Sep 24, 2021
Go Huobi Market Price Data Monitor

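Huobi price monitor. Because Huobi does not officially provide price monitoring for some trading pairs, I wrote a small program that long-term holders can use to get alerts on various spot prices. The tool only requires Go and Redis to be installed beforehand. Notifications are pushed through DingTalk, so a DingTalk robot (enterprise-group type, with a webhook) must be configured in advance. Usage: download this project, then copy from the root directory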

Oct 13, 2022
Productivity analytics monitor 🧮

Productivity analytics monitor 🧮

Oct 8, 2021
Gomon - Go language based system monitor

Copyright © 2021 The Gomon Project. Welcome to Gomon, the Go language based system monitor Over

Nov 18, 2022
Fast, zero config web endpoint change monitor

web monitor fast, zero config web endpoint change monitor. for comparing responses, a selected list of http headers and the full response body is stor

Nov 17, 2022
Monitor pipe progress via output to standard error.

Pipe Monitor Monitor pipe progress via output to standard error. Similar to functionality provided by the Pipe Viewer (pv) command, except this comman

Nov 14, 2022
The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.

The open-source platform for monitoring and observability. Grafana allows you to query, visualize, alert on and understand your metrics no matter wher

Jan 3, 2023
The Prometheus monitoring system and time series database.

Prometheus Visit prometheus.io for the full documentation, examples and guides. Prometheus, a Cloud Native Computing Foundation project, is a systems

Dec 31, 2022