metrics2.0 based, multi-tenant timeseries store for Graphite and friends.

Grafana Metrictank

Introduction

Grafana Metrictank is a multi-tenant timeseries platform that can be used as a backend or replacement for Graphite. It provides long-term storage, high availability, and efficient storage, retrieval, and processing for large-scale environments.

Grafana Labs has been running Metrictank in production since December 2015. It currently requires an external datastore like Cassandra or Bigtable, and we highly recommend using Kafka to support clustering, as well as a clustering manager like Kubernetes. This makes it non-trivial to operate, though Grafana Labs has an on-premise product that makes this process much easier.

Features

  • 100% open source
  • Heavily compressed chunks (inspired by the Facebook Gorilla paper) dramatically lower CPU, memory, and storage requirements and get much greater performance out of Cassandra than other solutions.
  • Writeback RAM buffers and chunk caches, serving most data out of memory.
  • Multiple rollup functions can be configured per series (or group of series), e.g. min/max/sum/count/average, and selected at query time via consolidateBy(). This lets us do consolidation (combined runtime + archived) accurately and correctly, unlike most other Graphite backends such as Whisper (see the example after this list).
  • Flexible tenancy: can be used as single tenant or multi tenant. Selected data can be shared across all tenants.
  • Input options: carbon, metrics2.0, kafka.
  • Guards against excessively large queries. (per-request series/points restrictions)
  • Data backfill/import from whisper
  • Speculative Execution means you can use replicas not only for High Availability but also to reduce query latency.
  • Write-Ahead buffer based on Kafka facilitates robust clustering and enables other analytics use cases.
  • Tags and Meta Tags support
  • Render response metadata: performance statistics, series lineage information and rollup indicator visible through Grafana
  • Index pruning (hide inactive/stale series)
  • Timeseries can change resolution (interval) over time; they will be merged seamlessly at read time. No need for any data migrations.
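
For example, a dashboard can pick the max rollup explicitly at query time through the render API (the series name below is just a placeholder):

    /render?target=consolidateBy(servers.web1.cpu.usage,'max')&from=-7d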

Relation to Graphite

The goal of Metrictank is to provide a more scalable, secure, resource efficient and performant version of Graphite that is backwards compatible, while also adding some novel functionality. (see Features, above)

There are two main ways to deploy Metrictank:

  • as a backend for Graphite-web, by setting the CLUSTER_SERVERS configuration value.
  • as an alternative to a Graphite stack. This enables most of the additional functionality. Note that Metrictank's API is not quite on par yet with Graphite-web: some less commonly used functions are not implemented natively yet, in which case Metrictank relies on a graphite-web process to handle those requests. See our graphite comparison page for more details.

Limitations

  • No performance/availability isolation between tenants per instance. (only data isolation)
  • Minimal computation locality: we move the data from storage to the processing code, which is both Metrictank and Graphite.
  • Can't overwrite old data. We support reordering within the most recent time window, but that's it (unless you restart MT).

Interesting design characteristics (feature or limitation... up to you)

  • Upgrades / process restarts require running multiple instances (potentially only for the duration of the maintenance) and possibly re-assigning the primary role; otherwise, data loss of current chunks will be incurred. See the operations guide.
  • Clustering works best with an orchestrator like Kubernetes. Metrictank itself does not automate master promotions. See clustering for more.
  • Only float64 values. Ints and bools are currently stored as floats (this works quite well thanks to the Gorilla compression).
  • Only uint32 Unix timestamps at second resolution. For higher resolution, consider streaming directly to Grafana.
  • We distribute data by hashing keys, like many similar systems. This means no data locality (data that is often used together may not live together); see the sketch below.
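
As a rough illustration of what key-based distribution looks like (a sketch only - the key format and partition count are made up, and Metrictank's real partitioner may differ):

    package main

    import (
        "fmt"
        "hash/fnv"
    )

    // partitionFor hashes a metric key onto one of n partitions. Related series
    // can land on different partitions, which is the loss of data locality
    // mentioned above.
    func partitionFor(key string, numPartitions uint32) uint32 {
        h := fnv.New32a()
        h.Write([]byte(key))
        return h.Sum32() % numPartitions
    }

    func main() {
        fmt.Println(partitionFor("servers.web1.cpu.usage", 8))
        fmt.Println(partitionFor("servers.web1.cpu.idle", 8))
    }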

Docs

installation, configuration and operation.

features in-depth

Other

Releases and versioning

  • releases and changelog

  • we aim to keep master stable and vet code before merging to master

  • We're pre-1.0 but adopt semver for our 0.MAJOR.MINOR format. The rules are simple:

    • MAJOR version for incompatible API or functionality changes
    • MINOR version when you add functionality in a backwards-compatible manner

    We don't do patch level releases since minor releases are frequent enough.

License

Copyright 2016-2019 Grafana Labs

This software is distributed under the terms of the GNU Affero General Public License.

Some specific packages have a different license:

Owner

Grafana Labs is behind leading open source projects Grafana and Loki, and the creator of the first open & composable observability platform.
Comments
  • future of raintank-metric, use something else?

    please help me fill this in. we need to agree on what our requirements/desires are before talking about using other tools

    current requirements?

    • safely relay metrics from our queue into storage and ES without losing data in case we can't safely deliver
    • decode messages from our custom format used in rabbitmq (but I suppose we could also store them differently in rabbit?)
    • encode messages into our custom format, to be stored in ES

    possible future requirements

    • real time aggregation
    • real time processing/alerting (I personally don't think we need to be too concerned about this just yet. once we have high performance/scalability requirements we'll probably use a dedicated real time processing framework like spark/storm/heron/...)

    questions

    • can we write our own decode, encode, processor plugins in Go, in heka?
    • can somebody describe what we do with ES from the raintank-metric/rabbitmq perspective and how dependent this is on the main storage backend? like if kairosdb is down, can we or must we still update ES? if ES is down, can or must we still write to kairos?
    • does rabbitmq support multiple readers of the same data, and does it maintain what has been acked by which reader?
  • rollups.

    the time has come.

    • https://github.com/raintank/ops/issues/112 has some details
    • I know we wrote down some thoughts etc. at the summit - do we still have those notes? Or perhaps it's not that important.
    • implementation will probably be an nsq consumer that generates all lower-res streams and stores them (including for current data), as opposed to a design where lower-res only starts where higher-res ends.
    • we can generate spread data like librato/omniti/hostedgraphite, or store individual min/max/avg/.. series, or use an algo like LTTB (see https://github.com/sveinn-steinarsson/flot-downsample/); a sketch of the min/max/avg option follows this list.
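
    A minimal sketch of the "store individual min/max/avg/.. series" option (toy types, not Metrictank's actual rollup code):

    package main

    import "fmt"

    // Point is a raw datapoint; Rollup holds per-bucket aggregates. Storing
    // min/max/sum/count lets avg be derived as sum/count at read time.
    type Point struct {
        Ts  uint32
        Val float64
    }

    type Rollup struct {
        Ts       uint32 // bucket start
        Min, Max float64
        Sum      float64
        Count    uint32
    }

    // rollup buckets points (assumed sorted by Ts) into spans of `interval` seconds.
    func rollup(points []Point, interval uint32) []Rollup {
        var out []Rollup
        for _, p := range points {
            bucket := p.Ts - (p.Ts % interval)
            if len(out) == 0 || out[len(out)-1].Ts != bucket {
                out = append(out, Rollup{Ts: bucket, Min: p.Val, Max: p.Val})
            }
            cur := &out[len(out)-1]
            if p.Val < cur.Min {
                cur.Min = p.Val
            }
            if p.Val > cur.Max {
                cur.Max = p.Val
            }
            cur.Sum += p.Val
            cur.Count++
        }
        return out
    }

    func main() {
        pts := []Point{{10, 1}, {20, 5}, {70, 2}, {80, 4}}
        for _, r := range rollup(pts, 60) {
            fmt.Printf("ts=%d min=%v max=%v avg=%v\n", r.Ts, r.Min, r.Max, r.Sum/float64(r.Count))
        }
    }
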
  • Add asPercent function

    Native implementation of asPercent() Graphite function. (http://graphite.readthedocs.io/en/latest/functions.html#graphite.render.functions.asPercent)

    Added a new argument type ArgIn that allows multiple other argument types. This was necessary for the total argument. Some of the code is borrowed from an abandoned PR: https://github.com/grafana/metrictank/pull/672

    In terms of speed improvement:

    ---------- Native Implementation ----------
    Requests      [total, rate]            900, 5.01
    Duration      [total, attack, wait]    3m0.13756s, 2m59.799999s, 337.561ms
    Latencies     [mean, 50, 95, 99, max]  72.006704ms, 38.065ms, 342.887ms, 472.467ms, 765.657ms
    Bytes In      [total, mean]            130300948, 144778.83
    Bytes Out     [total, mean]            0, 0.00
    Success       [ratio]                  100.00%
    Status Codes  [code:count]             200:900
    Error Set:
    ---------- Graphite (Python) Implementation ----------
    Requests      [total, rate]            900, 5.01
    Duration      [total, attack, wait]    3m6.337648s, 2m59.799999s, 6.537649s
    Latencies     [mean, 50, 95, 99, max]  797.282367ms, 167.489ms, 4.789024s, 6.756318s, 8.006429s
    Bytes In      [total, mean]            144407224, 160452.47
    Bytes Out     [total, mean]            0, 0.00
    Success       [ratio]                  100.00%
    Status Codes  [code:count]             200:900
    Error Set:
    

    On average, the native implementation was 11x faster; the median was 4x faster, p95 was 14x faster, p99 was 14x faster, and the max was over 10x faster.

  • Optimize for a large number of new metrics getting added while still serving queries fast

    We're still experiencing serious issues with instances that have a large index and high metric churn; in the worst case, all queries time out when the index gets slammed with too many adds at a time. We should first add a benchmark which adds a large number of metrics to a large index while concurrently querying it. Then we can try to optimize based on that benchmark, so queries still get served fast while index adds happen eventually but with lower priority.
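
    A rough shape for such a benchmark, under the assumption of a toy index type (the real memory index and its API differ):

    package index

    import (
        "fmt"
        "sync"
        "testing"
    )

    // toyIndex only exists to show the shape of the benchmark (concurrent adds
    // competing with finds); it is not the real Metrictank memory index.
    type toyIndex struct {
        mu     sync.RWMutex
        series map[string]struct{}
    }

    func (ix *toyIndex) Add(name string) {
        ix.mu.Lock()
        ix.series[name] = struct{}{}
        ix.mu.Unlock()
    }

    func (ix *toyIndex) Find(name string) bool {
        ix.mu.RLock()
        _, ok := ix.series[name]
        ix.mu.RUnlock()
        return ok
    }

    func BenchmarkFindDuringHighChurn(b *testing.B) {
        ix := &toyIndex{series: make(map[string]struct{})}
        for i := 0; i < 100000; i++ { // pre-populate a large index
            ix.Add(fmt.Sprintf("some.metric.%d", i))
        }
        done := make(chan struct{})
        go func() { // churn: keep adding new series in the background
            for i := 0; ; i++ {
                select {
                case <-done:
                    return
                default:
                    ix.Add(fmt.Sprintf("new.metric.%d", i))
                }
            }
        }()
        b.ResetTimer()
        for i := 0; i < b.N; i++ { // measure query latency while adds are happening
            ix.Find(fmt.Sprintf("some.metric.%d", i%100000))
        }
        close(done)
    }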

  • Use Confluent and move the kafka-consumers into one consumer struct

    Replaces the sarama consumers with confluent ones. Also gets rid of the duplication between the kafka notifier and kafka input by moving all kafka consumer related stuff into a new struct that's used by both of them.

  • meta-tags (previously known as extrinsic tags)

    Metrics 2.0 supports adding metadata to metrics; however, this comes at the cost of network bandwidth. A lot of metadata can be very static (e.g. the data-center a machine is in). It would be very nice to have a means of bulk-loading / updating static metadata and having it merge in with tags.

    For example, every metric might have a tag host. Associated with the host is a collection of static data: cluster, data-center, os, os-version, etc. We would like to feed this in; from Grafana it would appear as tags on the metric.
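
    A minimal sketch of the kind of merge being asked for (the host lookup table and tag names here are made-up examples, not an existing API):

    package main

    import "fmt"

    // metaTagsByHost is a bulk-loaded lookup of static metadata per host.
    var metaTagsByHost = map[string]map[string]string{
        "web1": {"cluster": "frontend", "data-center": "us-east", "os": "linux"},
    }

    // withMetaTags merges the static metadata for the metric's host tag into
    // the metric's own tags, so the sender doesn't have to transmit it each time.
    func withMetaTags(tags map[string]string) map[string]string {
        out := make(map[string]string, len(tags))
        for k, v := range tags {
            out[k] = v
        }
        for k, v := range metaTagsByHost[tags["host"]] {
            if _, exists := out[k]; !exists {
                out[k] = v // the metric's own tags win over static metadata
            }
        }
        return out
    }

    func main() {
        fmt.Println(withMetaTags(map[string]string{"name": "cpu.usage", "host": "web1"}))
    }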

  • deadlock in SyncChunkSaveState

    We had Cassandra slowness which caused the write queues in MT to fill up, but we noticed that on many of our instances one worker never drained. Over the course of a couple of minutes this caused all ingest in MT to stop.

    Attaching a snapshot of the dashboard. It's a bit hard to tell, so I added an arrow showing where queue 5 just hangs as the other queues all start to drain. Also attached a stack from one of our hung instances.

    screen shot 2017-09-11 at 5 04 43 pm

    metrictank.20170911.stack.txt

  • Prune index in Cassandra

    We currently keep adding entries to the index in Cassandra and never prune them. At startup MT needs to load all of that data and filter it by the LastUpdated property to ignore the ones that have not been updated for a certain amount of time, but this makes the startup slower and slower because it needs to filter more data. We should delete index entries from Cassandra once they have reached a certain age. That pruning age should probably be higher than when we prune them from the memory index, because we want to keep the ability to just adjust the memory pruning settings and restart MT to restore index entries that have already been pruned from memory. If a user decides to send a metric again, and hence "activates" it again in the cassandra/memory indices, the historic data will still be available just like it is now.

    The simplest solution would probably be a goroutine that occasionally loads all the data from the Cassandra index and deletes all entries that haven't been updated for a certain time.
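
    In code, that occasional pruning goroutine could look roughly like this (the store interface is hypothetical; the real index code talks to Cassandra via gocql):

    package main

    import (
        "log"
        "time"
    )

    // indexEntry and indexStore are a hypothetical view of the Cassandra-backed
    // index, just to illustrate the pruning loop.
    type indexEntry struct {
        ID         string
        LastUpdate int64 // unix seconds
    }

    type indexStore interface {
        LoadAll() ([]indexEntry, error)
        Delete(id string) error
    }

    // pruneLoop periodically deletes entries whose LastUpdate is older than
    // maxStale. As argued above, maxStale should exceed the memory-index
    // pruning age so recently memory-pruned series can still be restored.
    func pruneLoop(store indexStore, every, maxStale time.Duration) {
        ticker := time.NewTicker(every)
        defer ticker.Stop()
        for range ticker.C {
            entries, err := store.LoadAll()
            if err != nil {
                log.Printf("prune: loading index failed: %v", err)
                continue
            }
            cutoff := time.Now().Add(-maxStale).Unix()
            for _, e := range entries {
                if e.LastUpdate < cutoff {
                    if err := store.Delete(e.ID); err != nil {
                        log.Printf("prune: deleting %s failed: %v", e.ID, err)
                    }
                }
            }
        }
    }

    func main() {
        _ = pruneLoop // wiring up a real store is out of scope for this sketch
    }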

  • Add support for `summarize`

    There's a question to be answered before this is ready to merge: should the input series have a matching QueryPatt and Target? And are QueryFrom and QueryTo equivalent to series.start and series.end in the Python code?

    Also, if #833 is merged there will be a conflict with docs/graphite, and I'll rebase to squash commits.

  • reorderBuffer question

    Hi,

    Can you elaborate on how to configure the reorderBuffer? What should the relation be between that number and the raw interval specified in the first defined retention?

    From my tests it seems reordering works, but the extra data it reorders is gone after every "gc-interval".

  • Add tags to exported Metrictank stats

    We need to add a duplicate set of exported stats to Metrictank in order to facilitate the transition to a mostly tag based system.

    For a yet to be determined amount of time we should export both the current stats and the new stats with tags to allow alerts and queries to be updated accordingly. This will temporarily increase memory usage.

    This will also be a great opportunity to gather data on the memory usage reductions provided by #1212

    Examples of proposed tagged stats:

    memory.gc.cpu_fraction

    Old Stat

    metrictank.stats.$environment.$instance.memory.gc.cpu_fraction.gauge32
    

    New Stat

    memory.gc.cpu-fraction;application=metrictank;environment=$environment;instance-id=$instance;metric-type=gauge32
    

    api.request.node.latency

    Old Stat

    metrictank.stats.$environment.$instance.api.request.node.latency.mean.gauge32
    

    New Stat

    api.request.latency;application=metrictank;environment=$environment;instance-id=$instance;http-path=node;metric-aggregation=mean;metric-type=gauge32
    

    idx.memory.find-cache.invalidation.drop

    Old Stat

    metrictank.stats.$environment.$instance.idx.memory.find-cache.invalidation.drop
    

    New Stat

    idx.memory.find-cache.invalidation;application=metrictank;environment=$environment;instance-id=$instance;metric=drop;metric-aggregation=mean;metric-type=gauge32
    

    idx.memory.find-cache.invalidation.exec

    Old Stat

    metrictank.stats.$environment.$instance.idx.memory.find-cache.invalidation.exec
    

    New Stat

    idx.memory.find-cache.invalidation;application=metrictank;environment=$environment;instance-id=$instance;metric=exec;metric-aggregation=mean;metric-type=gauge32
    

    You might be wondering why I am proposing such long tag keys and values. The length of the tag key and value doesn't matter much when using #1212. All of the terms will be stored once as byte slices in the object store and many times as uintptrs in Metrictank itself, so being more verbose is not an issue.

    Each series will need to be inspected individually to determine the exact series name / tag combinations.
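
    For reference, rendering one of the proposed names in the semicolon-delimited tag format is straightforward (sketch; the tag values below are placeholders):

    package main

    import (
        "fmt"
        "sort"
        "strings"
    )

    // taggedName renders a metric name plus tags in the semicolon-separated
    // format shown above, e.g. "memory.gc.cpu-fraction;application=metrictank;...".
    func taggedName(name string, tags map[string]string) string {
        keys := make([]string, 0, len(tags))
        for k := range tags {
            keys = append(keys, k)
        }
        sort.Strings(keys) // deterministic ordering
        var b strings.Builder
        b.WriteString(name)
        for _, k := range keys {
            fmt.Fprintf(&b, ";%s=%s", k, tags[k])
        }
        return b.String()
    }

    func main() {
        fmt.Println(taggedName("memory.gc.cpu-fraction", map[string]string{
            "application": "metrictank",
            "environment": "prod",
            "instance-id": "mt0",
            "metric-type": "gauge32",
        }))
    }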

  • Refactor metrictank repo to be compatible with go modules

    This PR refactors the entire metrictank repository to make it compatible with Go modules, as well as with vscode's static compilation. All of the Makefile commands have been fixed (as far as I know). I believe travis/circleci is also working.

    As part of the change, the vendor/ directory has been removed from the checked-in source; instead, users should run go mod vendor locally as needed.

  • Control partition size using Cassandra as backend

    Hello guys

    How can I control the partition size for the tables in Cassandra? I have a 3GB partition in the metrictank.metric_idx table and it is obviously hurting performance.

  • Add tool for reporting out of order and duplicate metrics

    This PR adds a new tool, cmd/mt-kafka-mdm-report-out-of-order, which consumes metrics from Kafka and discovers those which are out of order or duplicates. It then groups these metrics by name or a specific tag using an index built from Cassandra, and outputs the results.
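
    The core classification the tool performs can be sketched as tracking the newest timestamp seen per series key (a simplification of what cmd/mt-kafka-mdm-report-out-of-order actually does):

    package main

    import "fmt"

    // ooTracker remembers the newest timestamp seen per metric key and
    // classifies each incoming point as in-order, duplicate, or out-of-order.
    type ooTracker struct {
        last map[string]uint32
    }

    func (t *ooTracker) observe(key string, ts uint32) string {
        prev, seen := t.last[key]
        switch {
        case !seen || ts > prev:
            t.last[key] = ts
            return "in-order"
        case ts == prev:
            return "duplicate"
        default:
            return "out-of-order"
        }
    }

    func main() {
        t := &ooTracker{last: make(map[string]uint32)}
        for _, ts := range []uint32{10, 20, 20, 15} {
            fmt.Println(ts, t.observe("some.metric", ts))
        }
    }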

  • Panic in function processing (seriesaggregators.go)

    In one of our production instances we're seeing a panic occurring regularly:

    [Macaron] PANIC: runtime error: index out of range [179] with length 179
    /usr/local/go/src/runtime/panic.go:88 (0x434fa4)
    /go/src/github.com/grafana/metrictank/expr/seriesaggregators.go:208 (0xc0b96c)
    /go/src/github.com/grafana/metrictank/expr/func_aggregate.go:73 (0xbdcb8b)
    /go/src/github.com/grafana/metrictank/expr/func_aggregate.go:60 (0xbdc714)
    /go/src/github.com/grafana/metrictank/expr/plan.go:327 (0xc0a6a8)
    /go/src/github.com/grafana/metrictank/api/graphite.go:1016 (0xc852d6)
    /go/src/github.com/grafana/metrictank/api/graphite.go:318 (0xc7d424)
    /usr/local/go/src/reflect/value.go:475 (0x4c11a6)
    /usr/local/go/src/reflect/value.go:336 (0x4c0698)
    /go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:177 (0xb01439)
    /go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:137 (0xb00e0a)
    /go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:121 (0xb1bc1c)
    /go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:112 (0xc6a124)
    /go/src/github.com/grafana/metrictank/vendor/github.com/raintank/gziper/gzip.go:100 (0xc6a117)
    /go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:79 (0xb1ba92)
    /go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:157 (0xb01154)
    /go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:135 (0xb00ef9)
    /go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:121 (0xb1bc1c)
    /go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:112 (0xb2cda5)
    /go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/recovery.go:161 (0xb2cd98)
    /go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/logger.go:40 (0xb1f7b7)
    /go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:157 (0xb01154)
    /go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:135 (0xb00ef9)
    /go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:121 (0xb1bc1c)
    /go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:112 (0xb83d54)
    /go/src/github.com/grafana/metrictank/api/middleware/logger.go:45 (0xb83d3d)
    /usr/local/go/src/reflect/value.go:475 (0x4c11a6)
    /usr/local/go/src/reflect/value.go:336 (0x4c0698)
    

    I think this is an indication of a bigger issue, because this "index out of range" shouldn't actually happen, but as a quick fix we could at least add a len() check on the relevant line to prevent the panic.
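
    The kind of len() guard meant here, sketched on plain float slices (the real code in expr/seriesaggregators.go works on series types with datapoints):

    package main

    import "fmt"

    // aggregate sums values index-by-index across input series, guarding
    // against inputs of unequal length instead of panicking with
    // "index out of range".
    func aggregate(in [][]float64) []float64 {
        maxLen := 0
        for _, s := range in {
            if len(s) > maxLen {
                maxLen = len(s)
            }
        }
        out := make([]float64, maxLen)
        for i := 0; i < maxLen; i++ {
            for _, s := range in {
                if i >= len(s) {
                    continue // the len() check suggested above
                }
                out[i] += s[i]
            }
        }
        return out
    }

    func main() {
        fmt.Println(aggregate([][]float64{{1, 2, 3}, {4, 5}}))
    }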

  • "limit exhausted" message due to max-series-per-req limit (for untagged requests)

    The new max-series limit for untagged requests (see #1926, #1929), specifically the limit fanned out to cluster peers (which is proportional to how much data the node has), has an issue. Because Metrictank has the limit enabled by default, deploying master can result in "limit exhausted" messages well before the number of series is actually hit.

    Here's why: first of all, note that UnpartitionedMemoryIdx.Find takes in the limit parameter, which is this proportional limit (if PartitionedMemoryIdx.Find is used, it further divides the limit by the number of partitions before it calls each UnpartitionedMemoryIdx.Find). This Find method relies on UnpartitionedMemoryIdx.findMaybeCached to use the find helper with the find cache in front of it. Note also that Find does "from filtering", meaning that e.g. for a query from now-1h to now, metric definitions with lastUpdate < from are known to have no data and are not included in the result set.

    But:

    • find() does a complete find (without 'from' filtering); these results can nicely be cached. If we from-filtered the results we'd need to add the from to the cache key and would probably cache similar data many times over, because on repeated queries the from would typically vary constantly. (On the other hand, if there are typically many definitions that haven't been updated in a while - e.g. heavy-churn situations - this approach means our cached results contain a large number of definitions we don't need.)
    • Find() is where we do the 'from' filtering.

    We implemented the "series limiter" in find() because we don't want to first assemble the entire result set only to then check the limit, as that would effectively make the limit moot. But implementing the limiter prior to from-filtering makes it impossible to apply correctly, as typically most entries will not be included in the final result set due to the 'from' filtering. This means a benign request that only asks for a reasonable amount of (from-filtered) series may hit the limiter if there are more entries with older lastUpdate timestamps.

    There's actually a second kind of bug, which seems more rare, but anyway. Because find() does a breadth-first search, as it progresses down the tree it may collect a lot of branches that match the expression so far. We have to keep those branches while traversing further down the tree (and the pattern) to see whether the branches (and ultimately the leaves) fully match the path. We precisely want to avoid applying the limit only after we've assembled the full response, which means we currently trigger a breach condition when the number of "candidate branches" exceeds the limit, but it may well be that those branches would have been dropped anyway.

    We want to avoid loading too much data into RAM (apply the limit as early as possible), while caching the find response body in a way that's agnostic wrt the 'from' filter, and in a correct way.

    My proposal:

    1. Change the find() algorithm to be depth-first rather than breadth-first. This means we can apply the limit as early as possible (without loading too much data into RAM), in a correct way, although the limit needs to be moved outside of the main find() algorithm so it can be applied on the from-filtered output (e.g. in Find).
    2. Rather than (assuming the uncached scenario) first finishing the complete find() - which may produce too much data - before doing the from filtering and applying the limit, change the calling convention to a lockstep approach with an iterator.

    find() becomes an iterator which feeds data to its caller (Find), which can do the from filtering and apply the limit while find() is running. The new find() can keep a full copy of its unfiltered output, so that - assuming the iteration is not aborted by its caller - it can add the full output to the cache after the iteration completes.

    To make the find iteration depth-first rather than breadth-first, we compute all matchers up front (we wouldn't want to recompute them every time we revisit a certain level of the tree). This implies:

    • con: we compute all matchers up front, in particular we may compute matchers needlessly if we would otherwise have found out "early" that a query had no matches
    • pro: we bail out early if a match expression is malformed; no need to traverse a section of the tree before detecting a bad matcher.
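
    A rough sketch of the proposed lockstep iterator, on a toy tree (the node type, matcher, and limit handling here are simplified stand-ins for the real memory index code):

    package main

    import (
        "errors"
        "fmt"
        "strings"
    )

    // node is a toy index tree node; the real memory index tree is richer.
    type node struct {
        name       string
        lastUpdate int64
        children   []*node
        leaf       bool
    }

    var errLimit = errors.New("limit exhausted")

    // find walks the tree depth-first and streams every matching leaf to emit,
    // without doing any 'from' filtering itself, so its full output could be
    // cached by the caller. If emit returns an error, the walk is aborted.
    func find(n *node, pattern []string, emit func(*node) error) error {
        if len(pattern) == 0 {
            if n.leaf {
                return emit(n)
            }
            return nil
        }
        for _, c := range n.children {
            if matches(pattern[0], c.name) {
                if err := find(c, pattern[1:], emit); err != nil {
                    return err
                }
            }
        }
        return nil
    }

    // matches stands in for the precomputed per-level matchers; only '*' and
    // exact names are handled in this sketch.
    func matches(pat, name string) bool { return pat == "*" || pat == name }

    // Find drives the iterator: it from-filters and applies the series limit in
    // lockstep with the walk, while keeping the unfiltered result for caching.
    func Find(root *node, query string, from int64, limit int) (filtered, unfiltered []*node, err error) {
        err = find(root, strings.Split(query, "."), func(n *node) error {
            unfiltered = append(unfiltered, n)
            if n.lastUpdate < from {
                return nil // from-filtered out; does not count against the limit
            }
            if len(filtered) >= limit {
                return errLimit
            }
            filtered = append(filtered, n)
            return nil
        })
        return filtered, unfiltered, err
    }

    func main() {
        root := &node{children: []*node{
            {name: "a", children: []*node{
                {name: "x", leaf: true, lastUpdate: 50},
                {name: "y", leaf: true, lastUpdate: 200},
            }},
        }}
        f, u, err := Find(root, "a.*", 100, 10)
        fmt.Println(len(f), len(u), err) // 1 2 <nil>
    }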