Open Source HTTP Reverse Proxy Cache and Time Series Dashboard Accelerator

Trickster is an HTTP reverse proxy/cache for HTTP applications and a dashboard query accelerator for time series databases.

Learn more below, and check out our roadmap to find out what else is in the works.

Trickster is hosted by the Cloud Native Computing Foundation (CNCF) as a sandbox level project. If you are a company that wants to help shape the evolution of technologies that are container-packaged, dynamically-scheduled and microservices-oriented, consider joining the CNCF.

Note: Trickster v1.1 is the production release, sourced from the v1.1.x branch. The main branch sources Trickster 2.0, which is currently in beta.

HTTP Reverse Proxy Cache

Trickster is a fully-featured HTTP Reverse Proxy Cache for HTTP applications like static file servers and web APIs.

Proxy Feature Highlights

Time Series Database Accelerator

Trickster dramatically improves dashboard chart rendering times for end users by eliminating redundant computations on the TSDBs it fronts. In short, Trickster makes read-heavy Dashboard/TSDB environments, as well as those with highly-cardinalized datasets, significantly more performant and scalable.

Compatibility

Trickster works with virtually any dashboard application that makes queries to any of these TSDBs:

Prometheus

ClickHouse

InfluxDB

Circonus IRONdb

See the Supported TSDB Providers document for full details.

How Trickster Accelerates Time Series

1. Time Series Delta Proxy Cache

Most dashboards request from a time series database the entire time range of data they wish to present, every time a user's dashboard loads, as well as on every auto-refresh. Trickster's Delta Proxy inspects the time range of a client query to determine which data points are already cached, and requests from the TSDB only the data points still needed to service the client request. This results in dramatically faster chart load times for everyone, since the TSDB is queried only for tiny incremental changes on each dashboard load, rather than several hundred data points of duplicative data.
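
To illustrate the idea, here is a simplified sketch in Go (illustrative names only, not Trickster's actual delta engine) of computing which sub-ranges of a request must still be fetched upstream:

    package main

    import (
        "fmt"
        "time"
    )

    // extent is a contiguous time range of cached or requested data.
    type extent struct{ start, end time.Time }

    // missing returns the sub-ranges of req that are not covered by cached,
    // i.e. the only ranges that must be fetched from the TSDB.
    func missing(req, cached extent) []extent {
        if cached.end.Before(req.start) || cached.start.After(req.end) {
            return []extent{req} // no overlap: fetch the whole request
        }
        var gaps []extent
        if req.start.Before(cached.start) {
            gaps = append(gaps, extent{req.start, cached.start})
        }
        if req.end.After(cached.end) {
            gaps = append(gaps, extent{cached.end, req.end})
        }
        return gaps
    }

    func main() {
        now := time.Now()
        cached := extent{now.Add(-24 * time.Hour), now.Add(-time.Minute)}
        request := extent{now.Add(-24 * time.Hour), now}
        // Only the most recent minute is requested upstream.
        fmt.Println(missing(request, cached))
    }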

2. Step Boundary Normalization

When Trickster requests data from a TSDB, it adjusts the client's requested time range slightly to ensure that all data points returned are aligned to normalized step boundaries. For example, if the step is 300s, all data points will fall on the clock 0's and 5's. This ensures that the data is highly cacheable, that it is conveyed visually to users in a more familiar way, and that all dashboard users see identical data on their screens.
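
A minimal sketch of the same normalization (an assumed helper, not Trickster's internal code), widening a requested range outward so both edges land on step boundaries:

    package main

    import (
        "fmt"
        "time"
    )

    // normalize floors start and ceils end to multiples of step, so every
    // point in the widened range falls on a cacheable step boundary.
    func normalize(start, end time.Time, step time.Duration) (time.Time, time.Time) {
        ns := start.Truncate(step)
        ne := end.Truncate(step)
        if ne.Before(end) {
            ne = ne.Add(step)
        }
        return ns, ne
    }

    func main() {
        step := 300 * time.Second
        start := time.Date(2021, 6, 1, 13, 21, 17, 0, time.UTC)
        ns, ne := normalize(start, start.Add(time.Hour), step)
        fmt.Println(ns, ne) // 13:20:00 and 14:25:00 -- aligned to the 0's and 5's
    }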

3. Fast Forward

Trickster's Fast Forward feature ensures that even with step boundary normalization, real-time graphs still always show the most recent data, regardless of how far away the next step boundary is. For example, if your chart step is 300s and the time is currently 1:21pm, you would normally be waiting another four minutes for a new data point at 1:25pm. Trickster will break the step interval for the most recent data point and always include it in the response to clients requesting real-time data.
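
As a rough sketch under assumed types (not Trickster's actual data model), fast forward amounts to appending one extra, un-aligned point carrying the latest live value to the end of the step-aligned series:

    package main

    import (
        "fmt"
        "time"
    )

    // point is a single timestamp/value pair in a series.
    type point struct {
        ts  time.Time
        val float64
    }

    // withFastForward appends the most recent instant value to a step-aligned
    // series so the chart's right edge reflects "now", not the last boundary.
    func withFastForward(aligned []point, latest point) []point {
        if len(aligned) == 0 || latest.ts.After(aligned[len(aligned)-1].ts) {
            return append(aligned, latest)
        }
        return aligned
    }

    func main() {
        now := time.Now()
        series := []point{{now.Truncate(5 * time.Minute), 42}}
        fmt.Println(withFastForward(series, point{now, 43.5}))
    }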

Trying Out Trickster

Check out our end-to-end Docker Compose demo composition for a zero-configuration running environment.

Installing

Docker

Docker images are available on Docker Hub:

    $ docker run --name trickster -d -v /path/to/trickster.yaml:/etc/trickster/trickster.yaml -p 0.0.0.0:8480:8480 trickstercache/trickster

See the 'deploy' directory for more information about using or creating Trickster Docker images.

Kubernetes

See the 'deploy' directory for Kubernetes deployment files and examples.

Helm

Trickster Helm Charts are located at https://helm.tricksterproxy.io for installation, and maintained at https://github.com/trickstercache/helm-charts. We welcome chart contributions.

Building from source

To build Trickster from the source code yourself, you need a working Go environment with version 1.17 or greater installed.

You can use the go tool directly to download and install the trickster binary into your GOPATH's bin directory:

    $ go install github.com/trickstercache/trickster/cmd/trickster@latest
    # this starts a prometheus accelerator proxy for the provided endpoint
    $ trickster -origin-url http://prometheus.example.com:9090 -provider prometheus

You can also clone the repository yourself and build using make:

    $ mkdir -p $GOPATH/src/github.com/trickstercache
    $ cd $GOPATH/src/github.com/trickstercache
    $ git clone https://github.com/trickstercache/trickster.git
    $ cd trickster
    $ make build
    $ ./OPATH/trickster -origin-url http://prometheus.example.com:9090 -provider prometheus

The Makefile provides several targets, including:

  • build: build the trickster binary
  • docker: build a Docker container for the current HEAD
  • clean: delete previously-built binaries and object files
  • test: run unit tests
  • bench: run benchmark tests
  • rpm: build a Trickster RPM

More information

  • Refer to the docs directory for additional info.

Contributing

Refer to CONTRIBUTING.md

Who Is Using Trickster

As the Trickster community grows, we'd like to keep track of who is using it in their stack. We invite you to submit a PR with your company name and @githubhandle to be included on the list.

  1. Comcast [@jranson]
  2. Selfnet e.V. [@ThoreKr]
  3. swarmstack [@mh720]
  4. Hostinger [@ton31337]
  5. The Remote Company (MailerLite, MailerSend, MailerCheck, YCode) [@aorfanos]

Comments
  • Trickster provokes grafana display quirks

    I noticed a few grafana display bugs in my graphs, which at first I blamed on Thanos, forgetting I had a Trickster in front of it.

    Expected graph: [screenshot grafana-prometheus]

    Thanos + Trickster graph: [screenshot grafana-thanos-2]

    The data are the same; they just come from a different origin/path.

    This only happens when I set the end of the range to now in grafana. If I fix the end of the range to a specific time, I can't reproduce it.

    Taking a look at the data coming from Prometheus and from Thanos + Trickster, I got the following diff:

    [screenshot: diff of the two responses]

    As you can see, the last timestamp is not aligned with the step in the Thanos + Trickster case, and it confuses grafana's output.

    Bypassing Trickster fixes the problem.

  • Unable to handle scalar responses

    If I send a simple scalar query such as /api/v1/query?query=5, trickster now returns errors. I bisected this, and it seems the error was introduced in ec4eff34d5532b1907723eeaabe620f02dd25b32. The basic problem here is that trickster assumes the response type from prometheus is a vector, when in reality it can be (1) scalar, (2) vector, or (3) string. The way this worked before was that the unmarshal error was ignored and the response handed back (I have a patch that puts it back to that behavior).

    From what I can see, it looks like caching won't work on those scalar values, since they couldn't be unmarshaled. Alternatively, we could change the model to use model.Value and then add some type-switching (roughly along the lines of the sketch below). But since that is a potentially large change, I figured I'd get others' input first.
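
    A rough sketch of that type-switching approach, using github.com/prometheus/common/model; the names here are illustrative only, not Trickster's actual types:

        package main

        import (
            "encoding/json"
            "fmt"

            "github.com/prometheus/common/model"
        )

        // queryResult mirrors the "data" object of a Prometheus query response.
        type queryResult struct {
            Type   model.ValueType `json:"resultType"`
            Result json.RawMessage `json:"result"`
        }

        // decode type-switches on resultType instead of assuming a vector.
        func decode(qr queryResult) (model.Value, error) {
            switch qr.Type {
            case model.ValScalar:
                var s model.Scalar
                err := json.Unmarshal(qr.Result, &s)
                return &s, err
            case model.ValVector:
                var v model.Vector
                err := json.Unmarshal(qr.Result, &v)
                return v, err
            case model.ValMatrix:
                var m model.Matrix
                err := json.Unmarshal(qr.Result, &m)
                return m, err
            case model.ValString:
                var str model.String
                err := json.Unmarshal(qr.Result, &str)
                return &str, err
            default:
                return nil, fmt.Errorf("unsupported result type %q", qr.Type)
            }
        }

        func main() {
            raw := []byte(`{"resultType":"scalar","result":[1435781451.781,"5"]}`)
            var qr queryResult
            if err := json.Unmarshal(raw, &qr); err != nil {
                panic(err)
            }
            v, err := decode(qr)
            fmt.Println(v, err)
        }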

  • Redis configuration is not caching any request

    I configured Trickster to use AWS ElastiCache/Redis 5.0.4, deployed as a master + 2 replicas without cluster mode enabled.

    I'm using the Trickster 1.0-beta8 image.

    It is deployed on K8s with 3 replicas, exposing Trickster through an Ingress so that Grafana can use it as a datasource.

    Every time Grafana runs a query, Trickster misses the cache; even copying the request from Grafana and running it with curl multiple times, it keeps missing the cache.

        [caches]
    
            [caches.default]
            # cache_type defines what kind of cache Trickster uses
            # options are 'bbolt', 'filesystem', 'memory', 'redis' and 'redis_cluster'
            # The default is 'memory'.
            type = 'redis'
    
            # compression determines whether the cache should be compressed. default is true
            # changing the compression setting will leave orphans in your cache for the duration of timeseries_ttl_secs
            compression = true
    
            # timeseries_ttl_secs defines the relative expiration of cached timeseries. default is 6 hours (21600 seconds)
            timeseries_ttl_secs = 21600
    
            # fastforward_ttl_secs defines the relative expiration of cached fast forward data. default is 15s
            fastforward_ttl_secs = 15
    
            # object_ttl_secs defines the relative expiration of generically cached (non-timeseries) objects. default is 30s
            object_ttl_secs = 30
    
                ### Configuration options for the Cache Index
                # The Cache Index handles key management and retention for bbolt, filesystem and memory
                # Redis handles those functions natively and does not use the Trickster's Cache Index
                [caches.default.index]
    
                # reap_interval_secs defines how long the Cache Index reaper sleeps between reap cycles. Default is 3 (3s)
                reap_interval_secs = 3
    
                # flush_interval_secs sets how often the Cache Index saves its metadata to the cache from application memory. Default is 5 (5s)
                flush_interval_secs = 5
    
                # max_size_bytes indicates how large the cache can grow in bytes before the Index evicts least-recently-accessed items. default is 512MB
                max_size_bytes = 536870912
    
                # max_size_backoff_bytes indicates how far below max_size_bytes the cache size must be to complete a byte-size-based eviction exercise. default is 16MB
                max_size_backoff_bytes = 16777216
    
                # max_size_objects indicates how large the cache can grow in objects before the Index evicts least-recently-accessed items. default is 0 (infinite)
                max_size_objects = 0
    
                # max_size_backoff_objects indicates how far under max_size_objects the cache size must be to complete object-size-based eviction exercise. default is 100
                max_size_backoff_objects = 100
    
    
    
                ### Configuration options when using a Redis Cache
                [caches.default.redis]
                # protocol defines the protocol for connecting to redis ('unix' or 'tcp') 'tcp' is default
                protocol = 'tcp'
                # endpoint defines the fqdn+port or path to a unix socket file for connecting to redis
                # default is 'redis:6379'
                endpoint = 'redis-common.external-service.svc.cluster.local:6379'
                # password provides the redis password
                # default is empty
                password = ''
    
  • Ability to cache older-but-frequently-accessed data

    Setting value_retention_factor = 1536 in the conf file has no impact. Only responses that have fewer than the default 1024 samples are fully cached. The config file is definitely being used, because I can switch to filesystem caching with it.

    This is an issue when the Grafana Prometheus datasource uses Resolution 1/1. That results in the smallest possible range_query step (one sample per pixel). For example, when querying 24h of data the step is set to 60s --> 1441 samples are returned and only 1024 are cached, always resulting in kmiss and phit.

    Workaround: when the Grafana resolution is set to 1/2 --> the step size increases to 120s --> 721 samples are returned and Trickster has a 100% cache hit rate.

  • Panic with v0.1.3

    Hi,

    I installed v0.1.3 in order to see if it fixes https://github.com/Comcast/trickster/issues/92 but I encountered this.

    panic: runtime error: index out of range
    goroutine 493 [running]:
    main.(*PrometheusMatrixEnvelope).cropToRange(0xc0001c06f8, 0x5c06535c, 0x0)
    	/go/src/github.com/Comcast/trickster/handlers.go:1014 +0x4ee
    main.(*TricksterHandler).originRangeProxyHandler(0xc000062140, 0xc000312640, 0x41, 0xc00014c8a0)
    	/go/src/github.com/Comcast/trickster/handlers.go:840 +0x615
    created by main.(*TricksterHandler).queueRangeProxyRequest
    	/go/src/github.com/Comcast/trickster/handlers.go:660 +0x276
    
  • [Question] Use with Prometheus High Availability?

    I'm super excited about this project! Thanks for sharing it with the community!

    I had a question about this part of the docs:

    In a Multi-Origin placement, you have one dashboard endpoint, one Trickster endpoint, and multiple Prometheus endpoints. Trickster is aware of each Prometheus endpoint and treats them as unique databases to which it proxies and caches data independently of each other.

    Could this work for load balancing multiple Prometheus servers in an HA setup? We currently have a pair of Prometheus servers in each region, redundantly scraping the same targets. Currently our Grafana is just pinned to one Prometheus server in each region, meaning that if that one goes down, our dashboards go down until we manually change the datasource to point to the other one (and by that point we would have just restored the first server anyway). It's kind of a bummer, because it means that while HA works great for alerting itself, it doesn't work for dashboards.

    Would be awesome if there was a way to achieve this with Trickster!

  • issues with helm deployment

    I faced an issue yesterday with Chart 1.1.2. It seems the 1.0-beta image tag changed to be a copy of 1.0-beta10, which caused trickster to fail to start after I updated my kubernetes cluster nodes and pulled the docker image. I then tried to update to the latest chart.

    For 1.3.0, some values in values.yaml are also duplicated, like the service section, which causes an issue with the config file having an empty listen_port.

    The PVC template also contains deprecated values, for example https://github.com/Comcast/trickster/blob/6ac009f29aeeea9476da9db6311d0aa7cf39033c/deploy/helm/trickster/templates/pvc.yaml#L1 (there is no section called config in the current values.yaml).

    Workaround: I used tag 1.0.9-beta9.

  • High memory usage with redis cache backend

    We're using the redis cache backend in our instance of trickster, and we're seeing surprisingly high memory usage.

    We are running trickster in kubernetes, with memory limited to 4GB. Trickster only needs to be up for half an hour before it's killed for using more than 4GB memory (OOM).

    Has anyone seen any similar behavior? Any idea what could be going on? We've tested increasing the memory limits, they were originally set to 2GB, but the problem persists.

  • 1.1 performance regression

    It seems that when we load test trickster with a prometheus backend, 1.1 has a reasonably large performance regression compared to 1.0.

    Both 1.1.2 and 1.1.3 seem to max out at around 100 requests per second when we load test them. 1.0 doesn't seem to get throttled in the same way and can go to several hundred (we haven't tried higher yet). We also see much higher CPU and memory usage for 1.1.

    We're trying about 15 different queries set to query over the last hour.

  • alignStepBoundaries edge cases

    This method is not very large, but does a few things which IMO are questionable.

    1. Reversing start/end: if the user flipped start and end, it is not the responsibility of trickster to flip them back. This is additionally confusing because things will work, but if trickster is removed from the call path, all the queries this fixes will break. IMO it is not the place of the cache to "correct" user queries.

    2. time.Now() checks: I think it's fine to cut off queries from going into the future, but this doesn't handle the case where both start and end are in the future; IMO that case should return an error. In the remaining cases I think it's fine to leave it truncating end, as the query results will remain unaffected, although I'd rather it didn't (the time.Now() constraint is a cache issue, not a query issue).

    3. Default step param: if the user didn't define one, it is not the place of a caching proxy to correct it. Here we should just return an error and make the user correct their query (a sketch of this stricter behavior is below).
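
    For illustration, a sketch of the stricter validation proposed above (a hypothetical helper, not the current alignStepBoundaries):

        package main

        import (
            "errors"
            "fmt"
            "time"
        )

        // validateRange rejects malformed ranges instead of silently fixing them.
        func validateRange(start, end time.Time, step time.Duration) error {
            if step <= 0 {
                return errors.New("step must be provided and positive")
            }
            if end.Before(start) {
                return errors.New("end must not precede start")
            }
            if now := time.Now(); start.After(now) && end.After(now) {
                return errors.New("range lies entirely in the future")
            }
            return nil
        }

        func main() {
            now := time.Now()
            err := validateRange(now.Add(time.Hour), now.Add(2*time.Hour), time.Minute)
            fmt.Println(err) // range lies entirely in the future
        }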

  • Add warning for potential Redis misconfigurations

    Hi - I've set trickster to cache to redis; I'm using beta 8 (but have tried multiple versions). Trickster seems to connect fine and even tries to store data:

    time=2019-06-19T12:56:57.101537228Z app=trickster caller=proxy/engines/cache.go:71 level=debug event="compressing cached data" cacheKey=thanos-query:9090.0a9332e4c9046613a62ba8a6e4a2e78a.sz
    time=2019-06-19T12:56:57.101899271Z app=trickster caller=cache/redis/redis.go:82 level=debug event="redis cache store" key=thanos-query:9090.0a9332e4c9046613a62ba8a6e4a2e78a.sz

    However, I get no keys in redis and the dashboard(s) load no quicker. I have tried Trickster with the in-memory option and that works as expected.

    I am also able to write to redis using both the CLI and an external test application, just to rule redis out.

    I've also tried standing up multiple Redis deployment types (e.g. standard, cluster, and sentinel).

    Thanks!

  • Logger not flushed before exitFatal()

    https://github.com/trickstercache/trickster/blob/2eeb4ba048ed1676105cf954c849a39278ce38cc/pkg/proxy/listener/listener.go#L185-L189

    This calls f() to exitFatal(), but the log message does not show on the terminal because the loggers are not flushed. I wasted an hour trying to understand why the program wouldn't start. It turned out a port was already bound from a previous run, but the log message was never printed, and since exitFatal() does not show a stack trace there was no way to know what happened. (The general flush-before-exit pattern is sketched below.)
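
    The general pattern being asked for, sketched here with go.uber.org/zap as a stand-in (this is not Trickster's actual logger API):

        package main

        import (
            "errors"
            "os"

            "go.uber.org/zap"
        )

        func main() {
            logger, _ := zap.NewProduction()

            exitFatal := func(msg string, err error) {
                logger.Error(msg, zap.Error(err))
                _ = logger.Sync() // flush buffered log entries before exiting
                os.Exit(1)
            }

            // e.g. a failed listener bind should surface in the logs before exit
            if err := listen(); err != nil {
                exitFatal("could not start listener", err)
            }
        }

        // listen stands in for the real startup path that may fail to bind a port.
        func listen() error { return errors.New("address already in use") }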

  • Frontend and Backend should be able to handle tls independently

    https://github.com/trickstercache/trickster/blob/main/pkg/proxy/tls/options/options.go#L79-L109

    This validation check requires that TLS be used for both the frontend and the backend. It should be possible to use TLS only for the backend and not for the frontend (see the sketch below).
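
    For illustration, a minimal sketch of the asymmetric setup being requested (plaintext frontend, TLS-only backend); this is generic Go, not Trickster's actual option wiring:

        package main

        import (
            "crypto/tls"
            "log"
            "net/http"
            "net/http/httputil"
            "net/url"
        )

        func main() {
            origin, err := url.Parse("https://prometheus.example.com:9090")
            if err != nil {
                log.Fatal(err)
            }

            proxy := httputil.NewSingleHostReverseProxy(origin)
            // TLS is configured only on the backend transport...
            proxy.Transport = &http.Transport{
                TLSClientConfig: &tls.Config{MinVersion: tls.VersionTLS12},
            }

            // ...while the frontend listener stays plain HTTP.
            log.Fatal(http.ListenAndServe(":8480", proxy))
        }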

  • Request Status Always "proxy-only" and General Improvements

    Trickster Version

    Trickster version: 1.1.5, buildInfo: 2022-11-10T14:56:50+0000, goVersion: go1.17.12, copyright: © 2018 Comcast Corporation

    Problem

    Every request made to an influxdb origin is returning X-Trickster-Result: engine=HTTPProxy; status=proxy-only.

    Below are my configuration (left) and two consecutive attempts at querying trickster (right): [screenshot: proxy-only]

    Is there something I am doing incorrectly in my configuration?

    On that note: I've been trying to get Trickster to work with ClickHouse, Prometheus, and InfluxDB, and have only gotten it to work with Prometheus. Are there plans to maintain this project more actively? I've been told that the ClickHouse plugin does not work, and in general it would be great if the logging were improved to explain why states like proxy-only occur.

    My company is at a point where we're considering new tools to help with load balancing and caching in front of ClickHouse, Prometheus, and Influx. Trickster appears to satisfy our requirements perfectly based on what I've read, and it would be incredibly useful for our use case. Thank you for your work so far!

  • Non ts cache

    Ports the short-term caching of SELECT requests, previously available for IRONdb, over to InfluxDB. Support for InfluxDB 2.0 and Flux is incoming as part of these changes; this is a draft to keep the changes public and up to date, since this is one of the roadmap items.

  • Purge from cache by key, or by path on local admin router

    Quick notes on changes:

    • Purge by key uses Gorilla mux with path matching, e.g. /trickster/purge/key/{backend}/{key}. Purge by path uses http.ServeMux with query parameters, where the path is URL-encoded, e.g. /trickster/purge/path?backend={backend}&path={path}. This is to help manage URL-encoded items and to avoid altering too much of the existing admin router code.
    • Purging by path uses a recreation of the cache key derivation for requests, calculating an MD5 sum of the passed path (roughly as illustrated in the sketch below).
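
    A rough illustration of that path-to-key derivation (a hypothetical helper, not the exact code in this PR):

        package main

        import (
            "crypto/md5"
            "encoding/hex"
            "fmt"
        )

        // keyForPath derives a purge key by MD5-hashing the request path.
        func keyForPath(path string) string {
            sum := md5.Sum([]byte(path))
            return hex.EncodeToString(sum[:])
        }

        func main() {
            fmt.Println(keyForPath("/api/v1/query_range"))
        }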