Prometheus Common Data Exporter can parse JSON, XML, yaml or other format data from various sources (such as HTTP response message, local file, TCP response message and UDP response message) into Prometheus metric data.

Last update: May 18, 2022

Prometheus Common Data Exporter

Prometheus Common Data Exporter parses JSON, XML, YAML, or other formats of data from multiple sources (such as HTTP responses, local files, TCP responses, and UDP responses) into Prometheus metric data.

Build

General

make

Build the Docker image

make && docker build -t data_exporter:0.2.0 .

Run

Normal startup

./data_exporter --config.file="data_exporter.yaml"

Debug the configuration file

./data_exporter --config.file="data_exporter.yaml" --log.level=debug

Run the examples

cd examples
nohup python3 -m http.server -b 127.0.0.1 10101 &  # start a background HTTP server; remember to stop it when the test is done
../data_exporter
# run in a new terminal window
curl 127.0.0.1:9116/metrics

Run with Docker

docker run --rm -d -p 9116:9116 --name data_exporter -v `pwd`:/etc/data_exporter/ microops/data_exporter:0.2.0 --config.file=/etc/data_exporter/config.yml

Configuration

collects:
  - name: "test-http"
    relabel_configs: [ ]
    data_format: "json" # source data format / data matching mode
    datasource:
      - type: "file"
        url: "../examples/my_data.json"
      - type: "http"
        url: "https://localhost/examples/my_data.json"
        relabel_configs: [ ]
    metrics: # metric matching rules
      - name: "Point1"
        relabel_configs: # post-process the matched data and labels; usage is the same as Prometheus relabel_configs
          - source_labels: [ __name__ ]
            target_label: name
            regex: "([^.]+)\\.metrics\\..+"
            replacement: "$1"
            action: replace
          - source_labels: [ __name__ ]
            target_label: __name__
            regex: "[^.]+\\.metrics\\.(.+)"
            replacement: "server_$1"
            action: replace
        match: # matching rules
          datapoint: "data|@expand|@expand|@to_entries:name:value" # data block match; each data block is the raw data of one metric
          labels: # label matching
            __value__: "value"
            __name__: "name"
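
The two relabel rules in the configuration above can be tried in isolation. The following is a minimal sketch, not the exporter's own code: it assumes the Prometheus convention of anchoring the regex at both ends and expanding `$1` from the match, and it assumes the datapoint expression yields a `__name__` such as `server1.metrics.CPU`. The helper name `relabelReplace` is made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// relabelReplace mimics the "replace" action: the pattern is anchored at both
// ends, and on a match the replacement template ($1, $2, ...) is expanded.
// The second return value reports whether the pattern matched.
func relabelReplace(pattern, replacement, source string) (string, bool) {
	re := regexp.MustCompile("^(?:" + pattern + ")$")
	idx := re.FindStringSubmatchIndex(source)
	if idx == nil {
		return "", false
	}
	return string(re.ExpandString(nil, replacement, source, idx)), true
}

func main() {
	// Assumed __name__ value produced by the datapoint expression.
	source := "server1.metrics.CPU"

	name, _ := relabelReplace(`([^.]+)\.metrics\..+`, "$1", source)
	metric, _ := relabelReplace(`[^.]+\.metrics\.(.+)`, "server_$1", source)

	fmt.Println(name)   // server1
	fmt.Println(metric) // server_CPU
}
```

With these rules the first capture becomes the `name` label and the metric name is rewritten with a `server_` prefix.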

Workflow

Data sources

file

datasource:
  - type: "file"
    name: <string> # data source name
    max_content_length: <int> # maximum read length in bytes, default 102400000
    relabel_configs: [ <relabel_config>, ... ] # see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config
    timeout: <duration> # default 30s, must not be less than 1ms; see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#duration
    read_mode: <string> # read mode, stream-line|full-text, default full-text
    url: "../examples/weather.xml"

http

datasource:
  - type: "http"
    name: <string> # data source name
    max_content_length: <int> # maximum read length in bytes, default 102400000
    relabel_configs: [ <relabel_config>, ... ] # see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config
    timeout: <duration> # default 30s, must not be less than 1ms; see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#duration
    read_mode: <string> # read mode, stream-line|full-text, default full-text
    url: "http://127.0.0.1:2001/weather.xml"
    http:
      # HTTP basic authentication credentials
      basic_auth:
        username: <string>
        password: <string>
        password_file: <string>
      # `Authorization` header configuration
      authorization:
        type: <string> # type, default Bearer
        credentials: <string>
        credentials_file: <string>
      oauth2: <oauth2> # OAuth 2.0 configuration; see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#oauth2
      proxy_url: <string> # proxy URL
      follow_redirects: <bool> # whether to follow redirects, default true
      tls_config: <tls_config> # TLS configuration; see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config
      body: string # HTTP request body
      headers: { <string>: <string>, ... } # custom HTTP headers
      method: <string> # HTTP request method, GET/POST/PUT...
      valid_status_codes: [ <int>,... ] # valid status codes, default 200~299

tcp

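datasource:
  - type: "tcp"
    name: <string> # data source name
    max_content_length: <int> # maximum read length in bytes, default 102400000
    relabel_configs: [ <relabel_config>, ... ] # see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config
    timeout: <duration> # default 30s, must not be less than 1ms; see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#duration
    read_mode: <string> # read mode, stream-line|full-text, default full-text
    url: "127.0.0.1:2001"
    tcp:
      tls_config: <tls_config> # TLS configuration; see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config
      send: # send may be a string, [string,...], {"msg": <string>,"delay": <duration>}, or [{"msg": <string>,"delay": <duration>},...]
        - msg: <string> # message to send
          delay: <duration> # time to wait after sending, default 0; the sum of all delays must not exceed timeout; see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#duration
      max_connect_time: <duration> # maximum time to establish the connection (excluding data transfer); if the connection is not established within this time, the request fails. Default 3s
      max_transfer_time: <duration> # maximum transfer time; once exceeded, reading stops and the connection is closed
      end_of: # end-of-message marker; when it is read, reading stops and the connection is closed. Messages are line-buffered, so end_of cannot span multiple lines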

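Note: end_of and max_transfer_time control when the connection is closed (that is, when the message transfer is considered complete). When the end_of marker is matched, or the transfer time reaches max_transfer_time, the connection is closed and no more data is received, but no error is raised. It is recommended to rely mainly on end_of and to increase max_transfer_time.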

udp

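datasource:
  - type: "udp"
    name: <string> # data source name
    max_content_length: <int> # maximum read length in bytes, default 102400000
    relabel_configs: [ <relabel_config>, ... ] # see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config
    timeout: <duration> # default 30s, must not be less than 1ms; see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#duration
    read_mode: <string> # read mode, stream-line|full-text, default full-text
    url: "127.0.0.1:2001"
    udp:
      send: # send may be a string, [string,...], {"msg": <string>,"delay": <duration>}, or [{"msg": <string>,"delay": <duration>},...]
        - msg: <string> # message to send
          delay: <duration> # time to wait after sending, default 0; the sum of all delays must not exceed timeout; see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#duration
      max_connect_time: <duration> # maximum time to establish the connection (excluding data transfer); if the connection is not established within this time, the request fails. Default 3s
      max_transfer_time: <duration> # maximum transfer time; once exceeded, reading stops and the connection is closed
      end_of: # end-of-message marker; when it is read, reading stops and the connection is closed. Messages are line-buffered, so end_of cannot span multiple lines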

Note: UDP does not support TLS yet.

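Labels

Labels generally follow the Prometheus conventions, with a few additional special labels:

__namespace__, __subsystem__, __name__
- __namespace__ and __subsystem__ are optional
- __name__ is required
- __namespace__, __subsystem__, and __name__ are joined with underscores to form the metric's fully qualified name (the metric name)
__value__: required; the metric value
__time_format__, __time__
- __time_format__ is optional
- __time__ is optional; if it is empty or no timestamp is matched, the metric carries no timestamp
- when __time__ is a Unix timestamp string (seconds, milliseconds, or nanoseconds), __time_format__ is not needed
- when __time__ is a time string in RFC3339Nano format (compatible with RFC3339), __time_format__ is not needed
- when __time__ is a time string in any other format, __time_format__ must be specified (see the Go source code)
__help__: optional; the metric help text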
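
The rules for the name and time labels can be sketched in Go as follows. This is a guess at the behaviour described above, not the exporter's code: the helper names `fqName` and `parseTime` are made up, and telling seconds, milliseconds and nanoseconds apart by the number of digits is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// fqName joins the non-empty parts with underscores.
func fqName(namespace, subsystem, name string) string {
	var parts []string
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

// parseTime uses layout (a Go time layout) when it is set; otherwise it tries
// a Unix timestamp first and falls back to RFC3339Nano.
func parseTime(value, layout string) (time.Time, error) {
	if layout != "" {
		return time.Parse(layout, value)
	}
	if n, err := strconv.ParseInt(value, 10, 64); err == nil {
		switch {
		case len(value) <= 10: // assumed to be seconds
			return time.Unix(n, 0), nil
		case len(value) <= 13: // assumed to be milliseconds
			return time.UnixMilli(n), nil
		default: // assumed to be nanoseconds
			return time.Unix(0, n), nil
		}
	}
	return time.Parse(time.RFC3339Nano, value)
}

func main() {
	fmt.Println(fqName("", "node", "cpu_usage"))
	t, err := parseTime("2022/05/18 08:00:00", "2006/01/02 15:04:05")
	fmt.Println(t.UTC(), err)
}
```

The layout string follows Go's reference time (`2006-01-02 15:04:05`), which is what "see the Go source code" refers to for __time_format__.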

relabel_configs

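See relabel_config in the official Prometheus documentation.

Metric matching syntax

datapoint: data point/block match; each data point/block is the raw data of one metric
- if the value is empty, all data is matched
labels: a map whose keys are label keys and whose values are expressions matching the label values; if an expression yields multiple results, only the first one is used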

Data matching modes

json

Example

Data

{
  "code": 0,
  "data": {
    "server1": {
      "metrics": {
        "CPU": "16",
        "Memory": 68719476736
      }
    },
    "server2": {
      "metrics": {
        "CPU": "8",
        "Memory": 34359738368
      }
    }
  }
}

Configuration

match: # matching rules
  datapoint: "data|@expand|@expand|@to_entries:name:value"
  labels:
    __value__: "value"
    __name__: "name"

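Notes

Generally follows gjson syntax
Adds the modifier: expand
- expands a map by one level; see below for details
Adds the modifier: to_entries
- converts a map to an array; see below for details

expand

Original data: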

{
  "server1": {
    "metrics": {
      "CPU": "16",
      "Memory": 68719476736
    }
  },
  "server2": {
    "metrics": {
      "CPU": "8",
      "Memory": 34359738368
    }
  }
}

Data after expanding with @expand:

{
  "server1.metrics": {
    "CPU": "16",
    "Memory": 68719476736
  },
  "server2.metrics": {
    "CPU": "8",
    "Memory": 34359738368
  }
}
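
The effect of @expand can be reproduced with a few lines of Go. This is only a sketch of the transformation shown above, not the exporter's implementation; the function name `expandOnce` is made up, and keeping non-object values unchanged is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// expandOnce lifts every nested object one level up, joining the parent and
// child keys with ".". Non-object values are kept as they are (an assumption).
func expandOnce(m map[string]any) map[string]any {
	out := make(map[string]any)
	for k, v := range m {
		child, ok := v.(map[string]any)
		if !ok {
			out[k] = v
			continue
		}
		for ck, cv := range child {
			out[k+"."+ck] = cv
		}
	}
	return out
}

func main() {
	raw := `{"server1":{"metrics":{"CPU":"16"}},"server2":{"metrics":{"CPU":"8"}}}`
	var data map[string]any
	if err := json.Unmarshal([]byte(raw), &data); err != nil {
		panic(err)
	}
	once := expandOnce(data)  // keys such as "server1.metrics"
	twice := expandOnce(once) // keys such as "server1.metrics.CPU"
	b, _ := json.Marshal(twice)
	fmt.Println(string(b))
}
```

Applying the step twice, as in `data|@expand|@expand`, yields flat keys of the form `server1.metrics.CPU`, which is the shape the relabel regexes in the configuration example expect.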

to_entries

Original data:

{
  "server1": {
    "metrics": {
      "CPU": "16",
      "Memory": 68719476736
    }
  },
  "server2": {
    "metrics": {
      "CPU": "8",
      "Memory": 34359738368
    }
  }
}

Data after conversion with @to_entries:

[
  {
    "key": "server1",
    "value": {
      "metrics": {
        "CPU": "16",
        "Memory": 68719476736
      }
    }
  },
  {
    "key": "server2",
    "value": {
      "metrics": {
        "CPU": "8",
        "Memory": 34359738368
      }
    }
  }
]

Data after conversion with @to_entries:name:val:

[
  {
    "name": "server1",
    "val": {
      "metrics": {
        "CPU": "16",
        "Memory": 68719476736
      }
    }
  },
  {
    "name": "server2",
    "val": {
      "metrics": {
        "CPU": "8",
        "Memory": 34359738368
      }
    }
  }
]

Data after conversion with @to_entries:-:val:

[
  {
    "metrics": {
      "CPU": "16",
      "Memory": 68719476736
    }
  },
  {
    "metrics": {
      "CPU": "8",
      "Memory": 34359738368
    }
  }
]

Data after conversion with @to_entries::-:

[
  "server1",
  "server2"
]
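
The four @to_entries variants above can be summarised in one small function. This is a sketch derived from the examples, not the exporter's code; the name `toEntries` is made up, and sorting the keys is done only to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// toEntries converts a map to a slice. Empty names fall back to "key" and
// "value"; a name of "-" drops that side, so only bare values or bare keys remain.
func toEntries(m map[string]any, keyName, valName string) []any {
	if keyName == "" {
		keyName = "key"
	}
	if valName == "" {
		valName = "value"
	}
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	out := make([]any, 0, len(keys))
	for _, k := range keys {
		switch {
		case keyName == "-": // @to_entries:-:val keeps only the values
			out = append(out, m[k])
		case valName == "-": // @to_entries::- keeps only the keys
			out = append(out, k)
		default:
			out = append(out, map[string]any{keyName: k, valName: m[k]})
		}
	}
	return out
}

func main() {
	m := map[string]any{"server1": 16, "server2": 8}
	fmt.Println(toEntries(m, "", ""))         // @to_entries
	fmt.Println(toEntries(m, "name", "val")) // @to_entries:name:val
	fmt.Println(toEntries(m, "-", "val"))    // @to_entries:-:val
	fmt.Println(toEntries(m, "", "-"))       // @to_entries::-
}
```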

yaml

YAML is converted to JSON internally and then processed; see the json section.

xml

XML is parsed with the etree library.

Configuration:

- name: "weather - week"
  match:
    datapoint: "//china[@dn='week']/city/weather"
    labels:
      __value__: "{{ .Text }}"
      name: '{{ ((.FindElement "../").SelectAttr "quName").Value }}'
      __name__: "week"
      path: "{{ .GetPath }}"

Configuration notes
- datapoint: the document is searched with etree.Element.FindElements
- labels: values are extracted with Go template syntax; the template data is an etree.Element object
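
How a label template is evaluated against an element can be sketched with Go's text/template package. To keep the example free of third-party imports, `element` and `attr` below are hypothetical stand-ins that model only the `Text` and `SelectAttr` members used in the configuration above; they are not the etree types, and the sample values are invented.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// attr and element are simplified stand-ins for etree.Attr and etree.Element.
type attr struct{ Value string }

type element struct {
	text  string
	attrs map[string]string
}

func (e *element) Text() string { return e.text }

func (e *element) SelectAttr(key string) *attr {
	return &attr{Value: e.attrs[key]}
}

// render evaluates a label template with the element as template data.
func render(tmpl string, e *element) (string, error) {
	t, err := template.New("label").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, e); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	e := &element{text: "25", attrs: map[string]string{"quName": "Beijing"}}
	value, _ := render("{{ .Text }}", e)
	name, _ := render(`{{ (.SelectAttr "quName").Value }}`, e)
	fmt.Println(value, name)
}
```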

regex

Regular expression matching with Perl syntax

- name: "server cpu"
  relabel_configs:
    - source_labels: [ __raw__ ]
      target_label: __value__
      regex: ".*cpu=(.+?)[!/].*"
    - source_labels: [ __raw__ ]
      target_label: name
      regex: ".*@\\[(.+?)].*"
    - target_label: __name__
      replacement: "cpu"
  match:
    datapoint: "@.*!"
    labels:
      __raw__: ".*"

To match across lines, use the (?s:.+) form; the s flag lets . match newlines (\n).
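
The difference the s flag makes can be checked with Go's standard regexp package. This sketch assumes the exporter's patterns behave like Go regular expressions; the helper name `matches` and the sample text are made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

func main() {
	// Hypothetical data point that spans two lines.
	text := "@[server1]\ncpu=16!"

	fmt.Println(matches(`@.*!`, text))      // "." stops at the newline
	fmt.Println(matches(`@(?s:.*)!`, text)) // "." may also match "\n"
}
```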

Named group matching

- name: regex - memory
  relabel_configs:
    - target_label: __name__
      replacement: memory
  match:
    datapoint: '@\[(?P<name>.+?)].*/ts=(?P<__time__>[0-9]+)/.*!'
    labels:
      __value__: memory=(?P<__value__>[\d]+)

When labels use named groups, the group name must be the same as the label name; otherwise the whole match is used.
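
Named groups can be read back by name with Go's regexp package, which gives a feel for how a datapoint pattern might populate labels. This is a sketch, not the exporter's code: the helper `namedGroups`, the group name `name`, and the sample line are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// namedGroups returns the named capture groups of the first match of pattern in s,
// or nil when the pattern does not match.
func namedGroups(pattern, s string) map[string]string {
	re := regexp.MustCompile(pattern)
	m := re.FindStringSubmatch(s)
	if m == nil {
		return nil
	}
	out := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if name != "" {
			out[name] = m[i]
		}
	}
	return out
}

func main() {
	// Hypothetical raw data point.
	line := "@[server1]/ts=1653000000/memory=1024!"

	fmt.Println(namedGroups(`@\[(?P<name>.+?)].*/ts=(?P<__time__>[0-9]+)/.*!`, line))
	fmt.Println(namedGroups(`memory=(?P<__value__>[\d]+)`, line))
}
```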

Owner

https://github.com/MicroOps-cn/data_exporter

Comments

Restricting XML file size

I want to read XML files whose size is above 600 bytes, so I modified the code as below, but it does not give the expected result. Can you please advise? File name: datasource.go

func (d *Datasource) ReadAll(ctx context.Context) ([]byte, error) {
	var reader io.Reader
	rc, err := d.GetStream(ctx)
	if err != nil {
		return nil, err
	}
	defer rc.Close()
	reader = io.LimitReader(rc, *d.MaxContentLength)
	if reader > 600 {
		return ioutil.ReadAll(reader)
	}
}

XML data extraction

I am sending my pmdata.xml file and data_exporter.yaml as an attachment: files.zip (https://github.com/MicroOps-cn/data_exporter/files/9007573/files.zip)

I have to extract each value from measResults in the pmdata.xml file, i.e. 101 200 300 90 30. Please help me extract values like this.

Container not working

Hi,

I am using it as a Docker container. I pulled the image from "docker.io/microops/data_exporter", but when I try to run the container with the Docker command you mentioned, "docker run --rm -d -p 9116:9116 --name data_exporter -v pwd:/etc/data_exporter/ microops/data_exporter:0.2.0 --config.path=/etc/data_exporter/config.yml", nothing happens and the container ends up in the exited state.

Can you please explain which files I have to put in "/etc/data_exporter/" and what has to be written in "config.path=/etc/data_exporter/config.yml"?

Thanks for your support.

Regards, Deepankar

How can the url of a datasource be changed dynamically?

collects:
  - name: "test-http"
    relabel_configs: [ ]
    data_format: "json" # source data format / data matching mode
    datasource:
      - type: "file"
        url: "../examples/my_data.json"
      - type: "http"
        url: "https://localhost/examples/my_data.json"
        relabel_configs: [ ]
    metrics: # metric matching rules
      - name: "Point1"
        relabel_configs: # post-process the matched data and labels; usage is the same as Prometheus relabel_configs
          - source_labels: [ __name__ ]
            target_label: name

Can the relabel_configs inside a datasource change the url of that datasource based on the request parameters? When we use the blackbox exporter with Prometheus, the relabelling below means that the targets actually probed are http://prometheus.io, https://prometheus.io, and http://example.com:8080. How can the url of a datasource be modified dynamically like this in data_exporter?

scrape_configs:
  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [http_2xx]  # Look for a HTTP 200 response.
    static_configs:
      - targets:
        - http://prometheus.io    # Target to probe with http.
        - https://prometheus.io   # Target to probe with https.
        - http://example.com:8080 # Target to probe with http on port 8080.
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115  # The blackbox exporter's real hostname:port.

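The value of a metric is not always a number; it would be good to handle other cases.

For example, with JSON data such as {"jsonrpc":"2.0","id":1,"result":"0xd8ef11"}, I want to extract the value of result and store it, but the strconv.ParseFloat(val, 64) call currently used returns an error.

The json_exporter implementation at https://github.com/prometheus-community/json_exporter/blob/c487740bb83f5b2a682d99161cbe8e4209ba4b2e/exporter/util.go#L41 could serve as a reference for adapting this.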

func SanitizeValue(s string) (float64, error) {
	var err error
	var value float64
	var resultErr string

	if value, err = strconv.ParseFloat(s, 64); err == nil {
		return value, nil
	}
	resultErr = fmt.Sprintf("%s", err)

	if intValue, err := strconv.ParseInt(s, 0, 64); err == nil {
		return float64(intValue), nil
	}
	resultErr = resultErr + "; " + fmt.Sprintf("%s", err)

	if boolValue, err := strconv.ParseBool(s); err == nil {
		if boolValue {
			return 1.0, nil
		}
		return 0.0, nil
	}
	resultErr = resultErr + "; " + fmt.Sprintf("%s", err)

	if s == "<nil>" {
		return math.NaN(), nil
	}
	return value, fmt.Errorf(resultErr)
}

How to extract data from multiple data files

I have to read multiple data files. Please help me with how to do that. Example: there are pmdata1.xml, pmdata2.xml, and pmdata3.xml in the location /data_exporter/tree/master/examples/xml_3gpp_ts. How can this be done?

Compilation Error

Step 1: make common-build succeeds. Step 2: make fails.

make

checking code style
checking license header
running golangci-lint
GO111MODULE=on go list -e -compiled -test=true -export=false -deps=true -find=false -tags= -- ./... > /dev/null
GO111MODULE=on /root/go/bin/golangci-lint run ./...
pkg/logs/logs.go:81:18: undeclared name: kingpin (typecheck)
    func AddFlags(a *kingpin.Application, config *Config) {
pkg/logs/logs.go:23:2: "gopkg.in/alecthomas/kingpin.v2" imported but not used (typecheck)
    "gopkg.in/alecthomas/kingpin.v2"
main.go:125:24: expected expression (typecheck)
    metrics := wrapper.M[]byte
collector/yaml.go:18:2: could not import encoding (-: could not load export data: cannot import "encoding" (unknown iexport format version 2), export data is newer version - update tool) (typecheck)
    "encoding"
collector/yaml.go:27:2: could not import unicode (-: could not load export data: cannot import "unicode" (unknown iexport format version 2), export data is newer version - update tool) (typecheck)
    "unicode"
collector/yaml.go:28:2: could not import unicode/utf8 (-: could not load export data: cannot import "unicode/utf8" (unknown iexport format version 2), export data is newer version - update tool) (typecheck)
    "unicode/utf8"
collector/collect.go:91:46: undeclared name: yaml (typecheck)
    func (c *CollectConfig) UnmarshalYAML(value *yaml.Node) error {
collector/datasource.go:124:43: undeclared name: yaml (typecheck)
    func (s *SendConfig) UnmarshalYAML(value *yaml.Node) error {
collector/datasource.go:140:44: undeclared name: yaml (typecheck)
    func (s *SendConfigs) UnmarshalYAML(value *yaml.Node) error {
collector/flags.go:18:24: undeclared name: kingpin (typecheck)
    func AddFlags(flagSet *kingpin.Application) {
collector/labels.go:233:16: cfg.Regex.MatchString undefined (type Regexp has no field or method MatchString) (typecheck)
    if cfg.Regex.MatchString(val) {
collector/labels.go:237:17: cfg.Regex.MatchString undefined (type Regexp has no field or method MatchString) (typecheck)
    if !cfg.Regex.MatchString(val) {
collector/labels.go:247:24: cfg.Regex.FindStringSubmatchIndex undefined (type Regexp has no field or method FindStringSubmatchIndex) (typecheck)
    indexes := cfg.Regex.FindStringSubmatchIndex(val)
collector/labels.go:252:39: cfg.Regex.ExpandString undefined (type Regexp has no field or method ExpandString) (typecheck)
    target := model.LabelName(cfg.Regex.ExpandString([]byte{}, cfg.TargetLabel, val, indexes))
collector/labels.go:257:20: cfg.Regex.ExpandString undefined (type Regexp has no field or method ExpandString) (typecheck)
    res := cfg.Regex.ExpandString([]byte{}, cfg.Replacement, val, indexes)
collector/labels.go:268:17: cfg.Regex.MatchString undefined (type Regexp has no field or method MatchString) (typecheck)
    if cfg.Regex.MatchString(l.Name) {
collector/labels.go:269:22: cfg.Regex.ReplaceAllString undefined (type Regexp has no field or method ReplaceAllString) (typecheck)
    res := cfg.Regex.ReplaceAllString(l.Name, cfg.Replacement)
collector/labels.go:494:9: undeclared name: xxhash (typecheck)
    h := xxhash.New()
collector/labels.go:510:9: undeclared name: xxhash (typecheck)
    return xxhash.Sum64(b)
collector/labels.go:532:9: undeclared name: xxhash (typecheck)
    return xxhash.Sum64(b), b
collector/net.go:63:7: c.Close undefined (type *ConnReader has no field or method Close) (typecheck)
    c.Close()
collector/net_test.go:53:13: conn.Close undefined (type connect has no field or method Close) (typecheck)
    defer conn.Close()
collector/net_test.go:59:20: conn.Write undefined (type connect has no field or method Write) (typecheck)
    _, err := conn.Write([]byte(fmt.Sprintf("%s\n", s)))
collector/collect.go:25:2: "gopkg.in/yaml.v3" imported but not used (typecheck)
    "gopkg.in/yaml.v3"
collector/datasource.go:25:2: "gopkg.in/yaml.v3" imported but not used (typecheck)
    "gopkg.in/yaml.v3"
collector/flags.go:16:8: "gopkg.in/alecthomas/kingpin.v2" imported but not used (typecheck)
    import "gopkg.in/alecthomas/kingpin.v2"
collector/labels.go:21:2: "github.com/cespare/xxhash/v2" imported but not used (typecheck)
    "github.com/cespare/xxhash/v2"
collector/labels.go:23:2: "gopkg.in/yaml.v3" imported but not used (typecheck)
    "gopkg.in/yaml.v3"
collector/metric.go:27:2: "math" imported but not used (typecheck)
    "math"
config/config.go:91:5: sc.Lock undefined (type *SafeConfig has no field or method Lock) (typecheck)
    sc.Lock()
config/config.go:92:11: sc.Unlock undefined (type *SafeConfig has no field or method Unlock) (typecheck)
    defer sc.Unlock()
config/config.go:98:5: sc.Lock undefined (type *SafeConfig has no field or method Lock) (typecheck)
    sc.Lock()
config/config.go:99:11: sc.Unlock undefined (type *SafeConfig has no field or method Unlock) (typecheck)
    defer sc.Unlock()
config/config.go:103:5: sc.Lock undefined (type *SafeConfig has no field or method Lock) (typecheck)
    sc.Lock()
config/config.go:104:11: sc.Unlock undefined (type *SafeConfig has no field or method Unlock) (typecheck)
    defer sc.Unlock()
pkg/wrapper/func.go:16:7: expected '(', found '[' (typecheck)
    func M[T any](ret T, err error) T {
pkg/wrapper/slice.go:22:11: expected '(', found '[' (typecheck)
    func Limit[T any](s []T, limit int, hidePosition int, manySuffix ...T) []T {
testings/testing.go:41:4: t.Log undefined (type *T has no field or method Log) (typecheck)
    t.Log(args...)
make: *** [Makefile.common:198: common-lint] Error 1

Compilation Error

I am facing issues while compiling. Step 1: make common-build succeeds. Step 2: make fails with this error:

make

checking code style
gofmt checking failed!
diff -u ./pkg/[email protected]/tail_posix.go.orig ./pkg/[email protected]/tail_posix.go
--- ./pkg/[email protected]/tail_posix.go.orig 2022-06-28 22:00:04.209118589 +0900
+++ ./pkg/[email protected]/tail_posix.go 2022-06-28 22:00:04.209118589 +0900
@@ -1,3 +1,4 @@
+//go:build linux || darwin || freebsd || netbsd || openbsd
 // +build linux darwin freebsd netbsd openbsd

 package tail
diff -u ./pkg/[email protected]/tail_windows.go.orig ./pkg/[email protected]/tail_windows.go
--- ./pkg/[email protected]/tail_windows.go.orig 2022-06-28 22:00:04.208118589 +0900
+++ ./pkg/[email protected]/tail_windows.go 2022-06-28 22:00:04.208118589 +0900
@@ -1,3 +1,4 @@
+//go:build windows
 // +build windows

 package tail
diff -u ./pkg/[email protected]/winfile/winfile.go.orig ./pkg/[email protected]/winfile/winfile.go
--- ./pkg/[email protected]/winfile/winfile.go.orig 2022-06-28 22:00:04.208118589 +0900
+++ ./pkg/[email protected]/winfile/winfile.go 2022-06-28 22:00:04.208118589 +0900
@@ -1,3 +1,4 @@
+//go:build windows
 // +build windows

 package winfile

Please ensure you are using go version go1.18.3 linux/amd64 for formatting code.
make: *** [Makefile.common:132: common-style] Error 1

HTTP requests with headers and a body do not return the expected response

    datasource:
      - type: "http"
        name: "eth_node"
        relabel_configs: []
        url: "https://bsc-dataseed2.binance.org"
        config:
          body: '{"id":1, "jsonrpc":"2.0", "method": "eth_blockNumber"}' # HTTP request body
          headers: { "Content-Type": "application/json"} # custom HTTP headers
          method: 'POST' # HTTP request method, GET/POST/PUT...

With a datasource configuration like this, the JSON data cannot be retrieved.
