
What it does

This POC is built with the goal of collecting events/logs from host systems such as Kubernetes, Docker, VMs, etc. A buffering layer buffers events from the collector, keeping the ingestion layer highly available so the collector can dump events without any issues. Events are stored in persistent storage (e.g. Kafka) and routed to destinations as required.

Design and Architecture

Below is the very high-level design of the project. It uses the following tech stack:

  • Fluent Bit - the logs/events collector
  • gRPC - the communication layer between Fluent Bit and the Ingester
  • Kafka - persistent storage for events


Fluentbit Agent

Fluentbit-Agent is used to collect events from the host system. A Fluentbit plugin is written that communicates with the Ingester API and pushes the events to it.
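For orientation, a Fluent Bit output plugin in Go is compiled as a shared library that exports a few C hooks. Below is a minimal, hypothetical skeleton using the fluent-bit-go package; the plugin in this repo may differ in naming and detail.

package main

import (
    "C"
    "log"
    "unsafe"

    "github.com/fluent/fluent-bit-go/output"
)

//export FLBPluginRegister
func FLBPluginRegister(def unsafe.Pointer) int {
    return output.FLBPluginRegister(def, "grpc", "gRPC output plugin")
}

//export FLBPluginInit
func FLBPluginInit(plugin unsafe.Pointer) int {
    // Read ACCESS_KEY/ACCESS_TOKEN and dial the Ingester here.
    return output.FLB_OK
}

//export FLBPluginFlush
func FLBPluginFlush(data unsafe.Pointer, length C.int, tag *C.char) int {
    dec := output.NewDecoder(data, int(length))
    for {
        ret, _, record := output.GetRecord(dec)
        if ret != 0 {
            break
        }
        // Forward each decoded record to the Ingester over gRPC here.
        log.Printf("record: %v", record)
    }
    return output.FLB_OK
}

func main() {}

Such a plugin is built with go build -buildmode=c-shared and loaded into Fluent Bit with the -e flag.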

When you start the FB-Agent, it requires the ACCESS_KEY and ACCESS_TOKEN:

  • ACCESS_KEY - Configuration for an agent is stored against this access key. The FB-Agent uses the access key to obtain its configuration from the Ingester API.
  • ACCESS_TOKEN - The JWT authentication token. The FB-Agent includes it with every request it makes to the Ingester API, and the Ingester validates the token.

It does the following

  • Connect to the Ingester
  • Exchange the config - requires ACCESS_TOKEN
  • Stream events to the Ingester - requires ACCESS_TOKEN

Config Exchange

Below is the diagram that shows the config exchange between the FB-Agent and the Ingester. The config exchange requires a valid ACCESS_TOKEN and ACCESS_KEY; the ACCESS_KEY is accepted as a request parameter.
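A minimal sketch of the exchange from the agent side, assuming hypothetical generated names (pb.NewIngesterClient, GetConfig, ConfigRequest) and a plaintext local connection; the real proto definitions and addresses in the repo may differ.

package main

import (
    "context"
    "log"
    "os"

    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"
    "google.golang.org/grpc/metadata"

    pb "example.com/pipeline/proto" // hypothetical generated package
)

func main() {
    conn, err := grpc.Dial("localhost:8080",
        grpc.WithTransportCredentials(insecure.NewCredentials()))
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    // The JWT travels as gRPC metadata; the ACCESS_KEY is a request field.
    ctx := metadata.AppendToOutgoingContext(context.Background(),
        "access_token", os.Getenv("ACCESS_TOKEN"))

    client := pb.NewIngesterClient(conn) // hypothetical generated client
    cfg, err := client.GetConfig(ctx, &pb.ConfigRequest{AccessKey: os.Getenv("ACCESS_KEY")})
    if err != nil {
        log.Fatal(err)
    }
    log.Printf("received config: %v", cfg)
}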

Streaming Events

Streaming RPC is used to stream events from the FB-Agent to the Ingester API. A valid ACCESS_TOKEN must be passed with every request.
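A sketch of the client-streaming side, reusing the token-carrying context from the config sketch above; StreamEvents, pb.Event, and its Payload field are hypothetical names.

package main

import (
    "context"

    pb "example.com/pipeline/proto" // hypothetical generated package
)

// streamEvents sends a batch of encoded events over a client-streaming RPC.
// ctx must already carry the access_token metadata.
func streamEvents(ctx context.Context, client pb.IngesterClient, batch [][]byte) error {
    stream, err := client.StreamEvents(ctx)
    if err != nil {
        return err
    }
    for _, payload := range batch {
        if err := stream.Send(&pb.Event{Payload: payload}); err != nil {
            return err
        }
    }
    // CloseAndRecv flushes the stream and waits for the Ingester's ack.
    _, err = stream.CloseAndRecv()
    return err
}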

Features

  • AES encryption (see the sketch after this list)
  • Server-side TLS
  • Streaming RPC - Events
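The AES feature is sketched below with AES-GCM from the Go standard library; the POC's actual cipher mode, key size, and nonce handling are assumptions here.

package main

import (
    "crypto/aes"
    "crypto/cipher"
    "crypto/rand"
    "io"
)

// encryptEvent seals an event payload with AES-GCM. key must be 16, 24,
// or 32 bytes. The nonce is prepended so the Router can decrypt later
// (compare the --decrypt-events flag described below).
func encryptEvent(key, plaintext []byte) ([]byte, error) {
    block, err := aes.NewCipher(key)
    if err != nil {
        return nil, err
    }
    gcm, err := cipher.NewGCM(block)
    if err != nil {
        return nil, err
    }
    nonce := make([]byte, gcm.NonceSize())
    if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
        return nil, err
    }
    return gcm.Seal(nonce, nonce, plaintext, nil), nil
}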

Open Issues

  • Dynamic reloading of FB-Agent config
  • Handling Auth Errors
  • Handling Connectivity Errors
  • Handling Streaming Errors/Timeouts

Ingester

The Ingester is responsible for running the RPC server. The FB-Agent sends events to the Ingester via streaming RPC. The Ingester also holds the config for each FB-Agent. It validates each incoming request using the passed ACCESS_TOKEN and pushes the events to the Kafka topic fb-kafka.

It does the following

  • Host the config for FB-Agents
  • Accept incoming events from FB-Agents
  • Perform JWT authentication (see the sketch after this list)
  • Push events to the Kafka topic fb-kafka
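A minimal sketch of the validation step using the golang-jwt library; the metadata key ("access_token"), the secret handling, and the library choice are assumptions.

package main

import (
    "context"
    "fmt"

    "github.com/golang-jwt/jwt/v4"
    "google.golang.org/grpc/metadata"
)

// validateToken pulls the ACCESS_TOKEN from incoming gRPC metadata and
// verifies the HMAC signature; expired tokens fail via the exp claim.
func validateToken(ctx context.Context, secret []byte) error {
    md, ok := metadata.FromIncomingContext(ctx)
    if !ok || len(md.Get("access_token")) == 0 {
        return fmt.Errorf("missing access token")
    }
    _, err := jwt.Parse(md.Get("access_token")[0], func(t *jwt.Token) (interface{}, error) {
        if _, ok := t.Method.(*jwt.SigningMethodHMAC); !ok {
            return nil, fmt.Errorf("unexpected signing method %v", t.Header["alg"])
        }
        return secret, nil
    })
    return err
}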

NOTE: Temporarily, the Ingester hosts the config for the FB-Agents. Ideally, it should be stored in a separate config service. It loads the config from this JSON file, where each key is an ACCESS_KEY that can be used with an FB-Agent.

IDEA: Some kind of hash could be calculated from customer event data so that events are pushed to a specific partition of the Kafka topic; then a single router could be spawned to consume events from that partition only. A sketch of this follows.
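A minimal sketch of that idea, hashing the ACCESS_KEY with FNV to pick a partition; the hash function and the choice of key are assumptions.

package main

import (
    "fmt"
    "hash/fnv"
)

// partitionFor maps an ACCESS_KEY to a stable partition so that all events
// for one customer land on a partition a single router can own.
func partitionFor(accessKey string, numPartitions uint32) uint32 {
    h := fnv.New32a()
    h.Write([]byte(accessKey))
    return h.Sum32() % numPartitions
}

func main() {
    fmt.Println(partitionFor("9c60f26f-5b6c-4c80-b5f5-625bf965b6a6", 6))
}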

Open Issues

  • Decouple config management
  • Handling Auth Errors
  • Handling Kafka Errors

Router

The Router consumes events from the Kafka topic fb-kafka and writes them to the specified destinations. Currently it is very minimal and only writes events to a file (ACCESS_KEY.log) on local disk.

It does the following

  • Consume events from the fb-kafka topic
  • Write events to a file (ACCESS_KEY.log) on local disk - one file is created per ACCESS_KEY (see the sketch after this list)
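A sketch of that consume-and-write loop using segmentio/kafka-go as a stand-in client (the repo's actual Kafka library is an assumption), and assuming the Ingester sets the ACCESS_KEY as the Kafka message key.

package main

import (
    "context"
    "fmt"
    "log"
    "os"

    "github.com/segmentio/kafka-go"
)

func main() {
    r := kafka.NewReader(kafka.ReaderConfig{
        Brokers: []string{"localhost:9092"},
        Topic:   "fb-kafka",
        GroupID: "router",
    })
    defer r.Close()

    for {
        m, err := r.ReadMessage(context.Background())
        if err != nil {
            log.Fatal(err)
        }
        // One file per ACCESS_KEY, appended to on every message.
        name := fmt.Sprintf("%s.log", m.Key)
        f, err := os.OpenFile(name, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
        if err != nil {
            log.Fatal(err)
        }
        f.Write(append(m.Value, '\n'))
        f.Close()
    }
}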

Open Issues

  • Routing to different targets
  • Ideally, events would be written to COS, S3, Elasticsearch, etc. This is not implemented yet.

Running The Project

Start Kafka

Go to the project root and run the following command to start the Kafka container on Docker. Please note that container data is not persisted to disk for now.

docker-compose -f kafka-docker-compose.yaml up -d

Start the ingester

  1. Go to the project root and run the following command to build it
cd ingester
go build .
  2. Run the following command to generate the ACCESS_TOKEN. It will be required when you start the FB-Agent
./ingester access-token --expiry=100000000000 
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjkzODc1OTk5ODAsImlhdCI6MTYyMTMyMDM0OSwicm9sZSI6IiJ9.8Rpy5M2l-BJ-pD75q8UukLKmSIJvt-O-DytkvtOwbFY

Available commands

Usage:
  Ingester access-token [flags]

Flags:
      --expiry int   Expiry duration of token in seconds (default 600)
  -h, --help         help for access-token
  3. Run the following command to start the Ingester Server
./ingester api --print-events 

Available commands

Usage:
  Ingester api [flags]

Flags:
  -h, --help           help for api
      --print-events   Print events as received from Fluentbit-Agent

Start the router

  1. Open a new terminal, go to the root of the project and run the following command to build it
cd router
go build .
  2. Run the following command to start the router
./router consumer --print-events --decrypt-events

Available commands

Usage:
  Router consumer [flags]

Flags:
      --decrypt-events   Decrypt events received from kafka - Events written to file also will be decrypted
  -h, --help             help for consumer
      --print-events     Print events as received from Kafka

Start FB-Agent

  1. Open a new terminal, go to the root of the project and run the following command to build it
docker build . -t fluentbit-collector -f Dockerfile
  • ACCESS_TOKEN - Use the access token you created while starting the Ingester
  • ACCESS_KEY - Use any of the keys from this JSON file
  2. Run the following command to run the container image
docker run -e ACCESS_KEY=9c60f26f-5b6c-4c80-b5f5-625bf965b6a6 -e ACCESS_TOKEN=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjkzODc1OTk5ODAsImlhdCI6MTYyMTMyMDM0OSwicm9sZSI6IiJ9.8Rpy5M2l-BJ-pD75q8UukLKmSIJvt-O-DytkvtOwbFY -it --rm fluentbit-collector
  3. Logs from the FluentBit Agent
gufran@Gufran fluentbit-grpc-events-pipeline % docker run -e ACCESS_KEY=9c60f26f-5b6c-4c80-b5f5-625bf965b6a6 -e ACCESS_TOKEN=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjk0MDM1ODYwOTQsImlhdCI6MTYzNzMwNjQ2Miwicm9sZSI6IiJ9.nj4Hr6MJ0F0QbgwVbWoVWtJIYDhHL9tUIfkA93yJcLs -it --rm fluentbit-collector

Fluent Bit v1.7.9
* Copyright (C) 2019-2021 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2021/11/19 07:52:32] [ info] [engine] started (pid=1)
[2021/11/19 07:52:32] [ info] [storage] version=1.1.1, initializing...
[2021/11/19 07:52:32] [ info] [storage] in-memory
[2021/11/19 07:52:32] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
2021/11/19 07:52:32 [out-grpc] plugin parameter = ''
[2021/11/19 07:52:32] [ info] [sp] stream processor started
2021/11/19 07:52:37 4 events sent to ingester
2021/11/19 07:52:37 4 events sent to ingester
2021/11/19 07:52:37 3 events sent to ingester
  • Observe the terminals to see the activity. Log files ${ACCESS_KEY}.log will be created inside the project root.
  • You can spin up more FB-Agent instances with different ACCESS_KEYs

Happy Hacking :-)
