A lightweight, cloud-native data transfer agent and aggregator


English | 中文

Loggie is a lightweight, high-performance, cloud-native log agent and aggregator written in Go. It supports multiple pipelines and pluggable components:

  • One-stack logging solution: data transfer, filtering, parsing, alerting, and more
  • Cloud native: native Kubernetes CRD usage
  • Production grade: a full range of observability, automated operations, and reliability capabilities
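
For example, a minimal pipeline in loggie.yml wires a file source to a sink (a sketch only; the log path and the dev sink here are illustrative, not defaults):

    pipelines:
      - name: demo
        sources:
          - type: file
            name: access
            paths:
              - /var/log/*.log    # illustrative path to tail
        sink:
          type: dev               # dev sink prints events, useful for debugging
          printEvents: true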

Architecture

Documentation

Setup

User Guide

Reference

License

Apache-2.0

Comments
  • How are logs inside containers collected?

    How are logs inside containers collected?

    Does the pod's YAML need a specific emptyDir or hostPath volume mount before the logs can be collected? Below is one of our demo YAML snippets:

    volumeMounts:
      - mountPath: /home/admin/logs
        name: 01606d0d6354456aa546dfbb36d8a764-datadir
        subPath: logs

    volumes:
      - name: 01606d0d6354456aa546dfbb36d8a764-datadir
        persistentVolumeClaim:
          claimName: >-
            01606d0d6354456aa54
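
    For context, a hedged sketch of a LogConfig that collects from such a mounted path (the names, label, and sink reference below are illustrative assumptions, not taken from this issue):

    apiVersion: loggie.io/v1beta1
    kind: LogConfig
    metadata:
      name: demo
      namespace: default
    spec:
      selector:
        type: pod
        labelSelector:
          app: demo                      # assumed pod label
      pipeline:
        sources: |
          - type: file
            name: applog
            paths:
              - /home/admin/logs/*.log   # the path as seen inside the container
        sinkRef: dev                     # assumed existing Sink CR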
    
  • LogConfig Elasticsearch error

    LogConfig Elasticsearch error

    {"level":"warn","time":"2022-03-21T17:26:04Z","caller":"/go/src/loggie.io/loggie/pkg/interceptor/retry/interceptor.go:175","message":"interceptor/retry retry buffer size(2) too large"} {"level":"error","time":"2022-03-21T17:26:04Z","caller":"/go/src/loggie.io/loggie/pkg/pipeline/pipeline.go:267","message":"consumer batch fail,err: elasticsearch client not initialized yet"}

    ES version: 7.x
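
    For reference, a minimal Elasticsearch Sink CR sketch (the host address and index name are illustrative assumptions):

    apiVersion: loggie.io/v1beta1
    kind: Sink
    metadata:
      name: es-sink
    spec:
      sink: |
        type: elasticsearch
        hosts: ["192.168.0.1:9200"]   # assumed reachable ES 7.x address
        index: "loggie-demo"          # assumed static index name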

  • goccy/go-yaml handles blank strings incorrectly

    goccy/go-yaml handles blank strings incorrectly

    goccy/go-yaml handles blank strings incorrectly; this was introduced by #242. I ran into this problem when parsing the default containerd log format.

    CODE

    package main
    
    import (
    	"fmt"
    
    	errYaml "github.com/goccy/go-yaml"
    	okYaml "gopkg.in/yaml.v2"
    )
    
    func main() {
    	v := struct {
    		Key string
    	}{
    		Key: " ",
    	}
    	d1, _ := okYaml.Marshal(v)
    	fmt.Printf("%s\n%s\n", "YES", string(d1))
    
    	d2, _ := errYaml.Marshal(v)
    	fmt.Printf("%s\n%s\n", "NO", string(d2))
    
    }
    

    OUTPUT

    YES
    key: ' '
    
    NO
    key:
    

    In what area(s)?

    /area interceptor

    What version of Loggie?

    v1.2.0+

    Expected Behavior

    Load interceptor successfully

    Actual Behavior

    Got a warning log: get processor error: Key: 'SplitConfig.Separator' Error:Field validation for 'Separator' failed on the 'required' tag.

    Steps to Reproduce the Problem

    1. Configure the Interceptor CRD:
        - type: normalize
          processors:
            - split:
                separator: ' '
                max: 4
                keys: [ "time", "stream", "F", "message" ]
            - drop:
                targets: [ "F", "body" ]
            - rename:
                convert:
                  - from: "message"
                    to: "body"
            - underRoot:
                keys:
                  - kubernetes
    
    2. kubectl delete pod and kubectl logs pod
  • Feat: add zinc sink

    Feat: add zinc sink

    Proposed Changes:

    • add zinc sink

    Which issue(s) this PR fixes:

    Fixes #

    Additional documentation:

    ZincSearch is the simplest and easiest search system to get up and running. It's an open-source, easy-to-use search engine that addresses your observability needs.

    pipelines:
      - name: local
        sources:
          - type: file
            name: demo
            paths:
              - /tmp/log/*.log
            fields:
              topic: "loggie"
        sink:
          type: zinc
          host: "http://127.0.0.1:4080"
          username: admin
          password: Complexpass#123
          index: "demo"
          codec:
            pretty: false
    
  • feat(sink): add pulsar sink

    feat(sink): add pulsar sink

    Proposed Changes:

    • Add support for a Pulsar sink

    Which issue(s) this PR fixes:

    Fixes #199

    Additional documentation:

    pulsar

    Use the pulsar sink to send log data to a downstream Pulsar cluster.

    !!! example

    ```yaml
    sink:
      type: pulsar
      url: pulsar://localhost:6650
      topic: persistent://tenant/namespace/topic
    ```
    

    url

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | url | string | required | none | Pulsar connection address that logs are sent to |

    topic

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | topic | string | required | none | Pulsar topic that logs are sent to |

    operation_timeout_seconds

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | operation_timeout_seconds | time.Duration | optional | 30s | Producer-create, subscribe, and unsubscribe operations will be retried until this interval, after which the operation will be marked as failed |

    connectionTimeout

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | connectionTimeout | time.Duration | optional | 5s | Timeout for the establishment of a TCP connection |

    sendTimeout

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | sendTimeout | time.Duration | optional | 30s | Timeout for a message that is not acknowledged by the server |

    maxPendingMessages

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | maxPendingMessages | int | optional | none | Max size of the queue holding messages pending an acknowledgment from the broker |

    hashingSchema

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | hashingSchema | int | optional | 0 | HashingScheme used to define the partition on which to publish a particular message. 0: JavaStringHash, 1: Murmur3_32Hash |

    compressionType

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | compressionType | int | optional | 0 | 0: NoCompression, 1: LZ4, 2: ZLIB, 3: ZSTD |

    LogLevel

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | LogLevel | string | optional | 0 | Log level: "info", "debug", "error" |

    batchingMaxSize

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | batchingMaxSize | int | optional | 2048 (KB) | Maximum number of bytes permitted in a batch |

    batchingMaxMessages

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | batchingMaxMessages | int | optional | 1000 | Maximum number of messages permitted in a batch |

    batchingMaxPublishDelay

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | batchingMaxPublishDelay | time.Duration | optional | 10ms | Time period within which sent messages will be batched |

    useTLS

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | useTLS | bool | optional | false | Whether to use TLS authentication |

    tlsTrustCertsFilePath

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | tlsTrustCertsFilePath | string | optional | none | Path to the trusted TLS certificate file |

    tlsAllowInsecureConnection

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | tlsAllowInsecureConnection | bool | optional | false | Whether the Pulsar client accepts untrusted TLS certificates from the broker |

    certificatePath

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | certificatePath | string | optional | none | Path to the TLS certificate |

    privateKeyPath

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | privateKeyPath | string | optional | none | Path to the TLS private key |

    token

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | token | string | optional | none | Set this if Pulsar authentication uses a token |

    tokenFilePath

    | Field | Type | Required | Default | Description |
    | ----- | ---- | -------- | ------- | ----------- |
    | tokenFilePath | string | optional | none | Path to a file containing the auth token |
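
    Putting several of the options above together, a hedged example (the values shown are illustrative, not defaults):

    sink:
      type: pulsar
      url: pulsar://localhost:6650
      topic: persistent://tenant/namespace/topic
      sendTimeout: 30s
      compressionType: 1            # LZ4
      batchingMaxMessages: 1000
      token: "<jwt-token>"          # only when token authentication is enabled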

  • The interceptors cannot remove fields such as stream and time

    The interceptors cannot remove fields such as stream and time

    Environment: Kubernetes 1.21 + Docker

    LogConfig:

    apiVersion: loggie.io/v1beta1
    kind: LogConfig
    metadata:
      name: nginx
      namespace: default
    spec:
      pipeline:
        interceptorsRef: nginx-interce
        sinkRef: nginx-sink
        sources: |
          - type: file
            name: mylog
            containerName: nginx
            fields:
              topic: "nginx-access"
            matchFields:
              labelKey: [app]
            paths:
              - stdout
      selector:
        labelSelector:
          app: nginx
        type: pod

    Interceptor:

    apiVersion: loggie.io/v1beta1
    kind: Interceptor
    metadata:
      name: nginx-interce
    spec:
      interceptors: |
        - type: normalize
          name: stdproc
          belongTo: ["mylog"]
          processors:
            - jsonDecode:
                target: body
            - drop:
                targets: ["stream", "time", "body"]
            - rename:
                convert:
                  - from: "log"
                    to: "message"

    Sink:

    apiVersion: loggie.io/v1beta1
    kind: Sink
    metadata:
      name: nginx-sink
    spec:
      sink: |
        type: dev
        printEvents: true

    After deploying and sending a request to nginx, the Loggie pod logs show:

    { "fields": { "namespace": "default", "nodename": "10.0.20.28", "podname": "nginx-6799fc88d8-td4sc", "containername": "nginx", "logconfig": "nginx", "topic": "nginx-access" }, "body": "{\"log\":\"10.203.2.0 - - [21/Mar/2022:14:47:44 +0000] \\\"GET / HTTP/1.1\\\" 200 615 \\\"-\\\" \\\"curl/7.29.0\\\" \\\"-\\\"\\n\",\"stream\":\"stdout\",\"time\":\"2022-03-21T14:47:44.246358969Z\"}" }

    The log field was not renamed, and the stream and time fields were not dropped.

  • Loggie fails to write to Kafka with an error

    Loggie fails to write to Kafka with an error

    2022-08-08 19:52:01 ERR pkg/pipeline/pipeline.go:341 > consumer batch failed: write to kafka: kafka write errors (2048/2048)

    Ask your question here:

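    For comparison, a minimal kafka sink configuration sketch (the broker address and topic are assumptions for illustration, not taken from this issue):

    sink:
      type: kafka
      brokers: ["kafka-0.kafka:9092"]   # assumed reachable broker
      topic: "loggie"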

  • Loggie cannot collect kube-event data

    Loggie cannot collect kube-event data

    Ask your question here:

    Version: loggie-v1.3.0-rc.0, using inClusterConfig to read the configuration file.

    The configuration is as follows:


    apiVersion: loggie.io/v1beta1
    kind: Interceptor
    metadata:
      name: jsondecode
    spec:
      interceptors: |
        - type: normalize
          name: json
          processors:
            - jsonDecode: ~
            - drop:
                targets: ["body"]

    apiVersion: loggie.io/v1beta1
    kind: ClusterLogConfig
    metadata:
      name: kubeevent
    spec:
      selector:
        type: cluster
        cluster: aggregator
      pipeline:
        sources: |
          - type: kubeEvent
            name: event
        interceptorRef: jsondecode
        sinkRef: k8sbgy-kube-eventer

    kube-event data still cannot be collected.

  • v1.3.0-rc.0: errors when multiline log collection and an Elasticsearch sink dynamic index name are configured together

    v1.3.0-rc.0: errors when multiline log collection and an Elasticsearch sink dynamic index name are configured together

    What version of Loggie?

    v1.3.0-rc.0

    Expected Behavior

    1. Logs are collected normally, multiline entries are merged according to the specified regex, and the result is written to the Elasticsearch index. 2. No error logs on the Loggie side.

    Actual Behavior

    1. The specified index is not created in Elasticsearch.
    2. The Loggie backend logs the following errors:

    2022-09-03 17:03:28 INF pkg/eventbus/export/logger/logger.go:141 > [metric]: {"filesource":{},"queue":{"public/catalog-channel":{"capacity":2048,"fillPercentage":0,"pipeline":"public/catalog","queueType":"channel","size":0}},"reload":{"ReloadTotal":6},"sink":{}}
    2022-09-03 17:03:39 ERR pkg/pipeline/pipeline.go:353 > consumer batch failed: send events to elasticsearch: request to elasticsearch response error: {"errors":true,"items":[{"index":{"_index":"--2022-09","type":"doc","status":400,"error":{"type":"invalid_index_name_exception","reason":"Invalid index name [--2022-09], must not start with '_', '-', or '+'","index":"--2022-09"}}}]}
    2022-09-03 17:03:39 ERR pkg/pipeline/pipeline.go:353 > consumer batch failed: send events to elasticsearch: request to elasticsearch response error: {"errors":true,"items":[{"index":{"_index":"--2022-09","type":"doc","status":400,"error":{"type":"invalid_index_name_exception","reason":"Invalid index name [--2022-09], must not start with '_', '-', or '+'","index":"--2022-09"}}}]}
    2022-09-03 17:03:45 WRN pkg/interceptor/retry/interceptor.go:191 > interceptor/retry retry buffer size(2) too large
    2022-09-03 17:03:45 INF pkg/interceptor/retry/interceptor.go:214 > next retry duration: 690ms
    2022-09-03 17:03:45 ERR pkg/pipeline/pipeline.go:353 > consumer batch failed: send events to elasticsearch: request to elasticsearch response error: {"errors":true,"items":[{"index":{"_index":"--2022-09","type":"doc","status":400,"error":{"type":"invalid_index_name_exception","reason":"Invalid index name [--2022-09], must not start with '_', '-', or '+'","index":"--2022-09"}}}]}

    Steps to Reproduce the Problem

    1. When deploying Loggie, add discovery.kubernetes.typePodFields:

    typePodFields:
      logconfig: "${_k8s.logconfig}"
      namespace: "${_k8s.pod.namespace}"
      workloadkind: "${_k8s.workload.kind}"
      workloadname: "${_k8s.workload.name}"
      nodename: "${_k8s.node.name}"
      nodeip: "${_k8s.node.ip}"
      poduid: "${_k8s.pod.uid}"
      podname: "${_k8s.pod.name}"
      podip: "${_k8s.pod.ip}"
      containerid: "${_k8s.pod.container.id}"
      containername: "${_k8s.pod.container.name}"
      containerimage: "${_k8s.pod.container.image}"

    2. Create a LogConfig whose source enables multiline collection and whose sink uses a dynamic index name:

    apiVersion: loggie.io/v1beta1
    kind: LogConfig
    metadata:
      name: catalog
      namespace: public
    spec:
      selector:
        type: pod
        labelSelector:
          app: catalog
          release: catalog
      pipeline:
        sources: |
          - type: file
            name: logfile
            addonMeta: true
            paths:
              - /catalog/logs/*.log
            multi:
              active: true
              pattern: "^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3}"
        sink: |
          type: elasticsearch
          hosts: ["192.168.0.1:9200"]
          index: "${fields.workloadname}-${fields.namespace}-${+YYYY-MM}"
          etype: "doc"
          codec:
            type: json
            beatsFormat: true

  • After pods are recreated in batch, logs from the new pods are not collected

    After pods are recreated in batch, logs from the new pods are not collected

    What version of Loggie?

    1.2.0

    Expected Behavior

    The LogConfig in question matches the label workload.user.cattle.io/workloadselector=statefulSet-game-s6-game-core-v4-cluster

    Logs were collected normally before, but the user recreated the pods. Because the original pod paths no longer exist after the recreation, Loggie shows corresponding error logs.

    Inside the Loggie container, the new pods are visible under /var/log/pods, but logs from the new pods are still not collected.

    kubectl describe logconfig shows that events is none.

    However, the labelSelector is fine: logs were collected normally before, it was never changed, and kubectl get pod with the same selector still matches the new pods.

    Actual Behavior

    Logs from the new pods are not collected.

    Steps to Reproduce the Problem

  • Using the transformer copy action in interceptors produces base64-encoded results

    Using the transformer copy action in interceptors produces base64-encoded results

    After processing with regex, the original field was gone, so I tried to first copy the body into a new field res and then split res instead; it turned out this does not work.

    interceptors: |
      - type: rateLimit
        qps: 100000
      - type: transformer
        actions:
          - action: copy(body, res)
          - action: regex(res)
            pattern: '(?

    I then changed the configuration to keep only - action: copy(body, res) and checked the result in Elasticsearch.

    The copied res field turns out to be base64-encoded. Is this a bug or a problem with my configuration?

  • Is there any example documentation for collecting logs inside containers?

    Is there any example documentation for collecting logs inside containers?

    Following the configuration tutorial on the official site, none of the kafka, dev, or zinc sinks receive any logs from inside the container. The Loggie configuration reports no errors, so I am not sure where the problem is.

    Environment

    • k8s: 1.24.3
    • containerd: 1.6.4
    • Loggie version: v1.4.0-rc.0

    Sink configuration

    apiVersion: loggie.io/v1beta1
    kind: Sink
    metadata:
      name: dev
    spec:
      sink: |
        type: dev
        printEvents: true
        codec:
          type: raw
    

    Interceptor configuration

    apiVersion: loggie.io/v1beta1
    kind: Interceptor
    metadata:
      name: default
    spec:
      interceptors: |
        - type: rateLimit
          qps: 90000
    

    Example log paths:

    /app/game/logs/dayreport/dayreport_2023-01-04.log
    /app/game/logs/dayreport/dayreport_2023-01-05.log
    

    LogConfig configuration

    apiVersion: loggie.io/v1beta1
    kind: LogConfig
    metadata:
      name: game
    spec:
      selector:
        type: pod
        labelSelector:
          app: game
      pipeline:
        sources: |
          - type: file
            name: game_dayreport
            fields:
              _log_type_: dayreport
            paths:
              - /app/game/logs/dayreport/dayreport_*.log     
        sinkRef: dev
        interceptorRef: default
    
  • Feat new db

    Feat new db

    Proposed Changes:

    • move persistence from /source/file to a single package
    • move db config to loggie.yml
    • replace sqlite with badger

    Which issue(s) this PR fixes:

    Fixes #

    Additional documentation:

    
    
  • Feat journal source

    Feat journal source

    Proposed Changes:

    • journal source (specific image with systemd needed)

    Which issue(s) this PR fixes:

    Fixes #

    Additional documentation:

    
    
  • Can LogConfig spec.selector support OR expressions?

    Can LogConfig spec.selector support OR expressions?
