Time Series Alerting Framework

Bosun

Bosun is a time series alerting framework developed by Stack Exchange. Scollector is a metric collection agent. Learn more at bosun.org.

Building

bosun and scollector are found under the cmd directory. Run go build in the corresponding directories to build each project. There's also a Makefile available for most tasks.
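
For example, to build both projects from the repository root (a minimal sketch of the go build route):

$ cd cmd/bosun && go build
$ cd ../scollector && go build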

Running

For a full stack with all dependencies, run docker-compose up from the docker directory. Don't forget to rebuild images and containers if you change the code:

$ cd docker
$ docker-compose down
$ docker-compose up --build

If you only need the dependencies (Redis, OpenTSDB, HBase) and would like to run Bosun on your machine directly (e.g. to attach a debugger), you can bring up the dependencies with these three commands from the repository's root:

$ docker run -p 6379:6379 --name redis redis:6
$ docker build -f docker/opentsdb.Dockerfile -t opentsdb .
$ docker run -p 4242:4242 --name opentsdb opentsdb

The OpenTSDB container will be reachable at http://localhost:4242. Redis listens on its default port 6379. Bosun, if brought up in a Docker container, is available at http://localhost:8070.
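
To verify the dependencies are up, you can hit OpenTSDB's version endpoint and ping Redis (standard OpenTSDB/Redis tooling, not project-specific scripts):

$ curl http://localhost:4242/api/version
$ docker exec redis redis-cli ping    # should print PONG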

Developing

Install:

  • Run make deps and make testdeps to set up all dependencies.
  • Run make generate when new static assets (like JS and CSS files) are added or changed.

The w.sh script builds and runs bosun in a loop, rebuilding whenever Go/JS/TS files change. It runs in read-only mode and does not send any alerts.

$ cd cmd/bosun
$ ./w.sh

Go Version:

  • See .travis.yml in the root of this repo for the version of Go to use. Newer versions of Go should generally work, as long as Bosun builds without error.

Miniprofiler:

  • Bosun includes miniprofiler in the web UI which can help with debugging. The key combination ALT-P will show miniprofiler. This allows you to see timings, as well as the raw queries sent to TSDBs.

Comments
  • Support influxdb

    It would help if bosun supported influxdb. I didn't find a bug tracking this, so here it is.

    I have multiple data sources (collectd, statsite) sending data to influxdb, so it would keep my dependencies low if I could point bosun at influxdb rather than migrate the entire system to OpenTSDB.

  • Multiple backends of the same type?

    Is it possible to have multiple instances of the same type of backend, for example multiple InfluxDB backends or multiple Elasticsearch backends? I ask because I'm trying to pull in data from two separate instances, but simply creating a duplicate key results in a config error: fatal: main.go:88: conf: bosun.config:2:0: at <influxHost = xx.xx.x...>: duplicate key: influxHost
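
    The rejected config looks something like this (reconstructed from the error message; the addresses are placeholders):

    influxHost = xx.xx.x.x:8086
    influxHost = yy.yy.y.y:8086    # the second entry triggers "duplicate key: influxHost"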

  • Distributed alert checks to prevent high load spikes

    This is a solution for #2065

    The idea behind this is simple: every check run is slightly shifted so that the checks are distributed uniformly.

    For the subset of checks that run with period T, a shift is added to every check. The shift ranges from 0 to T-1, and shifts are assigned incrementally. For example, if we have 6 checks every 5 mins (T=5), the shifts will be 0, 1, 2, 3, 4, 0. Without the patch, all 6 checks happen at times 0 and 5; with the patch, two checks happen at time 0, one at 1, one at 2, and so on. The total number of checks and the check period stay the same.

    Here is a test that shows the effect of the patch on system load; note that the majority of checks in this system have a 5 min period. [graph: system load before and after the patch]
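
    A minimal Go sketch of the shifting idea (illustrative names, not the scheduler code from this PR):

    package main

    import "fmt"

    // shiftFor returns the offset, in the same units as the period, for the
    // i-th check with period T, spreading the checks uniformly over the period.
    func shiftFor(i, period int) int {
        return i % period
    }

    func main() {
        // Six checks with a 5-minute period get shifts 0, 1, 2, 3, 4, 0.
        for i := 0; i < 6; i++ {
            fmt.Printf("check %d runs at minute offset %d\n", i, shiftFor(i, 5))
        }
    }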

  • Config management

    I want to deploy bosun as a dashboard & alerting system within my organization, but having config management be completely external to bosun feels like a major drawback. It would be super fantastic if it were possible, entirely through the web interface, to define, test, and commit a new alert, or to update an existing alert to tweak its parameters.

    Is anything like this in the works? How do you manage this in your existing deployments?

  • Support Dependencies

    Problem: something goes down which results in lots of other things being down, and because of this we get a lot of alerts.

    Common Examples:

    • A network partition: some portion of hosts becomes unavailable from bosun's perspective
    • Host goes down: everything monitored on that host becomes unavailable
    • Service dependencies: we expect some service to go down if another service goes down
    • Bosun can't query its database (this is probably a different feature, but noting it here nonetheless)

    Things I want to be able to do based on our config at Stack Exchange:

    • Have our host-based alert macro detect whether the host is in Oregon (because the host name contains "or-"), i.e. a dependency based on a lookup table
    • Have our host-based alerts not trigger if bosun is unable to ping the host (which would most likely be another alert)
    • Be able to have dependencies for alerts that may have no group

    The status of any alert instance suppressed by a dependency should be "unevaluated". Unevaluated instances won't show up on the dashboard or trigger notifications.

    Two general approaches come to mind. The first is that a dependency is another alert: the other alert is run first, and whether the dependent alert triggers depends on its result. The second is that a dependency is an expression; the expression route only really makes sense if an alert itself can be used as an expression.

    Another possibility, which I haven't thought much about, is that alerts generate dependencies rather than the other way around: for example, an alert marks some tagset as something that should not be evaluated.

    Making Stuff Up....

    macro ping_location {
        template = ping.location
        $pq = max(q("sum:bosun.ping.timeout{dst_host=$loc*,host=$source}", "5m", ""))
        $grouped = t($pq, "")
        $hosts_timing_out = sum($grouped)
        $total_hosts = len($grouped)
        $percent_timeout = $hosts_timing_out / $total_hosts * 100
        crit = $percent_timeout > 10
    }

    #group is empty
    alert or_hosts_down {
        $source = ny-bosun01
        $loc = or-
        $name = OR Peak
        macro = ping_location
    }

    #Group is {dst_host=*}
    alert host_down {
        template = host_down
        crit = max(q("sum:bosun.ping.timeout{dst_host=*}", "5m", ""))
    }

    lookup location {
        entry host=or-* {
            alert = alert("or_hosts_down")
        }
        ...
    }

    macro host_based {
        #This makes it so host-based alerts built on this macro won't trigger
        #while their location alert (or the host_down alert) is already firing.
        dependency = lookup("location", "alert") || alert("host_down")
        #Another idea here is that you can create tag synonyms for an alert. So instead of having to add this lookup function that translates, have a synonym feature of alerts (and also global) that says "consider this tag key to be the same as this tag key". This would also solve an issue with silences (i.e. silencing host=ny-web11 doesn't do anything for the haproxy alert that has hosts as svname). Another issue with that is that those alerts are not tag based, so we actually need inhibit in that case.
    }
    
    
  • Bosun sending notifications for closed and inactive alerts

    We have a very simple rule file with 3 notifications (http post to PD and slack, and email) and a bunch of alert rules which trigger them. We are facing a weird issue wherein the following happens:

    • alert triggers, sends notifications
    • a human acks the alert
    • human solves problem, alert becomes inactive
    • human closes the alert
    • notifications still keep triggering (the alert is nowhere to be seen in the bosun UI/api) - forever!

    To explain it through logs, this is quite literally what we're seeing:

    2016/04/01 07:56:37 info: check.go:513: check alert masked.masked.write.rate.too.low start
    2016/04/01 07:26:38 info: check.go:537: check alert masked.masked.write.rate.too.low done (1.378029647s): 0 crits, 0 warns, 0 unevaluated, 0 unknown
    2016/04/01 07:26:38 info: alertRunner.go:55: runHistory on masked.masked.write.rate.too.low took 54.852815ms
    2016/04/01 07:26:39 info: search.go:205: Backing up last data to redis
    2016/04/01 07:28:20 info: notify.go:57: [bosun] critical: component xyz write rate too low: 0.00 records/minute in {adaptor=masked-masked-masked,colo=xyz,stream=writeAttributeToKafka}
    2016/04/01 07:28:20 info: notify.go:57: [bosun] critical: component xyz write rate too low: 0.00 records/minute in {adaptor=masked-masked-masked,colo=xyz,stream=writeActivityToKafka}
    2016/04/01 07:28:20 info: notify.go:57: [bosun] critical: component xyz write rate too low: 0.00 records/minute in {adaptor=masked-masked-masked,colo=xyz,stream=writeAttributeToKafka}
    2016/04/01 07:28:20 info: notify.go:57: [bosun] critical: component xyz write rate too low: 0.00 records/minute in {adaptor=masked-masked-masked,colo=xyz,stream=writeActivityToKafka}
    2016/04/01 07:28:20 info: notify.go:57: [bosun] critical: component xyz write rate too low: 0.00 records/minute in {adaptor=masked-masked-masked,colo=xyz,stream=writeAttributeToKafka}
    2016/04/01 07:28:20 info: notify.go:57: [bosun] critical: component xyz write rate too low: 0.00 records/minute in {adaptor=masked-masked-masked,colo=xyz,stream=writeActivityToKafka}
    2016/04/01 07:28:20 info: notify.go:115: relayed alert masked.masked.write.rate.too.low{adaptor=masked-masked-masked,colo=xyz,stream=writeAttributeToKafka} to [[email protected]] sucessfully. Subject: 148 bytes. Body: 3500 bytes.
    2016/04/01 07:28:20 info: notify.go:115: relayed alert masked.masked.write.rate.too.low{adaptor=masked-masked-masked,colo=xyz,stream=writeActivityToKafka} to [[email protected]] sucessfully. Subject: 147 bytes. Body: 3497 bytes.
    2016/04/01 07:28:20 info: notify.go:80: post notification successful for alert masked.masked.write.rate.too.low{adaptor=masked-masked-masked,colo=xyz,stream=writeAttributeToKafka}. Response code 200.
    2016/04/01 07:28:20 info: notify.go:80: post notification successful for alert masked.masked.write.rate.too.low{adaptor=masked-masked-masked,colo=xyz,stream=writeActivityToKafka}. Response code 200.
    2016/04/01 07:28:20 info: notify.go:80: post notification successful for alert masked.masked.write.rate.too.low{adaptor=masked-masked-masked,colo=xyz,stream=writeAttributeToKafka}. Response code 200.
    2016/04/01 07:28:20 info: notify.go:80: post notification successful for alert masked.masked.write.rate.too.low{adaptor=masked-masked-masked,colo=xyz,stream=writeActivityToKafka}. Response code 200.

  • Use templates body as payload for notifications and subject for other HTML related stuff

    Hi all, as described in the docs, I'm using the template's subject as the body for POSTing stuff to our hipchat bot. The problem I encounter is in Bosun's main view (the list of alerts), where the template subject is presented when clicking an alert for details.

    The suggestion is to use the template's body as the payload for notifications (POST notifications mainly). A flag could also be added to let the user choose which templates use the subject as payload and which use the body.

    Thanks, Yarden

  • Add Recovery Emails

    When an alert instance goes from (Unknown, Warning, or Critical) to Normal, a recovery email should be sent.

    Considerations:

    • Should recovery templates be their own template? I think they should, and repeated logic can be done via include templates
    • Who to notify? The same notifications that were notified of the previous state
    • Notifications will need a no_recovery option. This is needed if we want to hook up alerts to pagerduty (we don't want our phones being dialed to let us know that an issue has recovered; at that point we can rely on email)

    My main reservation about this feature is that users are more likely not to investigate an alert that has recovered; this is dangerous because the alert could be a latent issue. However, it is better to provide a frictionless workflow than a roadblock. Bosun aims to provide all the tools needed for very informative notifications so that good judgements can often be made without needing to go to a console. Furthermore, we should also add acknowledgement notifications, as a way to inform all recipients of an alert that someone has made a decision about it and hopefully committed to an action (fixing the actual problem, or tuning the alert).

    Ack emails will be described in another issue.

    This feature needs discussion and review prior to implementation.

  • Memory leak in Bosun

    I updated our test servers to the latest version of bosun from https://github.com/bosun-monitor/bosun/releases/download/20150428222252/bosun-linux-amd64. After running for slightly less than a day, it stopped responding.

    The command line where I started it revealed:

     ./bosun-linux-amd64 -c=/data/bosun.conf
    2015/05/04 16:21:54 enabling syslog
    Killed
    

    Syslog (cat /var/log/messages |grep bosun) did not reveal any log messages in the hours before the crash.

    It looks like a memory leak. The graph of bosun.collect.alloc grew gradually from 200MB after deploying the new version to 12GB just before the "crash". [graph: rapid memory growth]

    Looking back over the last week at the memory behaviour of the previous version, there was a similar memory growth pattern, but at a much slower rate. The bottom graph shows memory increasing gradually over the course of a week, followed by two rapid increases for the newer version. [graphs: memory over the last 7 days]

    Just for interest's sake, here is a general Bosun dashboard; the other stats look reasonable. Although there is a high number of goroutines after restarting Bosun, this appears unrelated to the leak. [screenshot: Bosun dashboard]
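
    A general way to investigate this kind of growth is a heap profile. This assumes the bosun binary exposes Go's default net/http/pprof handlers on its listen address, which is not confirmed here:

    $ go tool pprof http://localhost:8070/debug/pprof/heap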

    More information about our setup:

    • Backend: OpenTSDB
    • Data is being passed through Bosun to OpenTSDB (as visible from the dashboard)
    • We send data points every minute at a rate of about 37000 per minute
    • In addition, scollector is submitting data from one machine, monitoring OpenTSDB, Elasticsearch, Bosun, and the OS (Linux)
    • The rule file is still a small prototype:
    httpListen = :8070
    tsdbHost = localhost:4242
    
    smtpHost = ******
    emailFrom = ******
    
    macro grafanaConfig {
        $grafanaHost = ******
    }
    
    notification emailIzak {
        email = [email protected]
        next = emailIzak
        timeout = 24h
    }
    
    
    ##################### Templates #######################
    
    
    template generic {
        body = `{{template "genericHeader" .}}
        {{template "genericDef" .}}
    
        {{template "genericTags" .}}
    
        {{template "genericComputation" .}}
    
         {{if .Alert.Vars.graph}}
         <h3>{{.Alert.Vars.graphTitle}}</h3>
        <p>{{.Graph .Alert.Vars.graph}}
        {{end}}`
    
        subject =  {{.Last.Status}}: {{.Alert.Name}} on instance {{.Group.serviceinstance}}
    }
    
    template genericHeader {   
        body = `
        <h3> Possible actions </h3>   
        {{if .Alert.Vars.note}}
            <p>{{.Alert.Vars.note}}
        {{end}}
         <p><a href="{{.Ack}}">Acknowledge alert</a>
    
        {{if .Alert.Vars.grafanaDash}}
        <p><a href="{{.Alert.Vars.grafanaDash}}"> View the relevant statistics dashboard </a>
        {{end}}
        `
    }
    
    template genericDef {
        body = `
        <h3> Details </h3>
        <p><strong>Alert definition:</strong>
        <table>
            <tr>
                <td>Name:</td>
                <td>{{replace .Alert.Name "." " " -1}}</td></tr>
            <tr>
                <td>Warn:</td>
                <td>{{.Alert.Warn}}</td></tr>
            <tr>
                <td>Crit:</td>
                <td>{{.Alert.Crit}}</td></tr>
        </table>`
    }
    
    template genericTags {
        body = `<p><strong>Tags</strong>
    
        <table>
            {{range $k, $v := .Group}}
                {{if eq $k "host"}}
                    <tr><td>{{$k}}</td><td><a href="{{$.HostView $v}}">{{$v}}</a></td></tr>
                {{else}}
                    <tr><td>{{$k}}</td><td>{{$v}}</td></tr>
                {{end}}
            {{end}}
        </table>`
    }
    
    template genericComputation {
        body = `
        <p><strong>Computation</strong>
    
        <table>
            {{range .Computations}}
                <tr><td><a href="{{$.Expr .Text}}">{{.Text}}</a></td><td>{{.Value}}</td></tr>
            {{end}}
        </table>`
    }
    
    template unknown {
        subject = {{.Name}}: {{.Group | len}} unknown alerts. 
        body = `
        <p>Unknown alerts imply no data is being recorded for their monitored time series. Therefore we cannot know what is happening. 
        <p>Time: {{.Time}}
        <p>Name: {{.Name}}
        <p>Alerts:
        {{range .Group}}
            <br>{{.}}
        {{end}}`
    }
    
    unknownTemplate = unknown
    
    
    #################### alerts #######################
    
    
    alert FlowRouterBytesZero {
        template = generic
        $query = "sum:bytes.bytes.counter.value{serviceinstance=*}"
    
        $note = The flow router has reported zero bytes in the last 2 minutes. This note should contain extra information specifying what action the operator should take to resolve it. 
        $graph = q($query, "24h", "")
        $graphTitle = Flow router traffic in the last 24 hours
        macro = grafanaConfig
        $grafanaDash = $grafanaHost/dashboard/db/per-flow-route-bytes-drill-down
    
        $avgBytesPer2Min = avg(q($query, "2m", ""))
        $avgBytesPer5Min = avg(q($query, "5m", ""))
    
        warn =  $avgBytesPer2Min == 0
        crit =  $avgBytesPer5Min == 0
        critNotification = emailIzak
    }
    
    
  • Add series aggregation DSL function `aggregate`

    This PR adds an aggregate DSL function, which allows one to combine different series in a seriesSet using a specified aggregator (currently min, max, p50, avg).

    This is particularly useful when comparing data across different weeks (using the over function). In our case, for anomaly detection, we want to compare the current day's data with an aggregated view of the same day in previous weeks. In particular, we want to compare each point in the last day to the median of the corresponding points in the same day over the last 3 weeks, so that any anomalies that occurred in a previous week are ignored. This way we compare with a hypothetical "perfect" day.

    For example:

    $weeks = over("avg:10m-avg-zero:os.cpu", "24h", "1w", 3)
    $a = aggregate($weeks, "", "p50")
    merge($a, $q)
    

    Which looks like this:

    [screenshot of the resulting graph]

    Or, if we wanted to combine series but maintain the region and color groups, that query would look like this:

    $weeks = over("avg:10m-avg-zero:os.cpu{region=*,color=*}", "24h", "1w", 3)
    aggregate($weeks, "region,color", "p50")
    

    which would result in one merged series for each unique region/color combination.

    I am very happy to take suggestions for changes / improvements. With regards to naming the function, I would have probably chosen "merge", but since that is already taken, I went with the OpenTSDB terminology and used "aggregate".

  • Unable to query bosun after running for a minute

    I have installed HBase, OpenTSDB and bosun on a machine running CentOS 7. I can see the bosun website fine, but any query I try to run from the graph page gives an error. I've put the bosun output into a log file, and there are 2 kinds of errors that pop up. Sometimes it's too many open files:

    2016/03/04 11:10:23 error: queue.go:102: Post http://localhost:8070/api/put: dial tcp 127.0.0.1:8070: socket: too many open files

    Sometimes it's just a timeout.

    2016/03/04 11:14:06 error: queue.go:102: Post http://localhost:8070/api/put: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

    Sometimes restarting seems to help, other times not so much. The longest I've had bosun running without these errors is a day.
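
    A common first step for "too many open files" is to check and raise the file-descriptor limit in the shell that starts bosun (a general troubleshooting suggestion, not a confirmed fix for this issue):

    $ ulimit -n           # show the current per-process limit
    $ ulimit -n 65536     # raise it for this shell before starting bosun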

  • build(deps): bump github.com/aws/aws-sdk-go from 1.31.12 to 1.33.0

    Bumps github.com/aws/aws-sdk-go from 1.31.12 to 1.33.0.

    Changelog

    Sourced from github.com/aws/aws-sdk-go's changelog.

    Release v1.33.0 (2020-07-01)

    Service Client Updates

    • service/appsync: Updates service API and documentation
    • service/chime: Updates service API and documentation
      • This release supports third party emergency call routing configuration for Amazon Chime Voice Connectors.
    • service/codebuild: Updates service API and documentation
      • Support build status config in project source
    • service/imagebuilder: Updates service API and documentation
    • service/rds: Updates service API
      • This release adds the exceptions KMSKeyNotAccessibleFault and InvalidDBClusterStateFault to the Amazon RDS ModifyDBInstance API.
    • service/securityhub: Updates service API and documentation

    SDK Features

    • service/s3/s3crypto: Introduces EncryptionClientV2 and DecryptionClientV2 encryption and decryption clients which support a new key wrapping algorithm kms+context. (#3403)
      • DecryptionClientV2 maintains the ability to decrypt objects encrypted using the EncryptionClient.
      • Please see s3crypto documentation for migration details.

    Release v1.32.13 (2020-06-30)

    Service Client Updates

    • service/codeguru-reviewer: Updates service API and documentation
    • service/comprehendmedical: Updates service API
    • service/ec2: Updates service API and documentation
      • Added support for tag-on-create for CreateVpc, CreateEgressOnlyInternetGateway, CreateSecurityGroup, CreateSubnet, CreateNetworkInterface, CreateNetworkAcl, CreateDhcpOptions and CreateInternetGateway. You can now specify tags when creating any of these resources. For more information about tagging, see AWS Tagging Strategies.
    • service/ecr: Updates service API and documentation
      • Add a new parameter (ImageDigest) and a new exception (ImageDigestDoesNotMatchException) to PutImage API to support pushing image by digest.
    • service/rds: Updates service documentation
      • Documentation updates for rds

    Release v1.32.12 (2020-06-29)

    Service Client Updates

    • service/autoscaling: Updates service documentation and examples
      • Documentation updates for Amazon EC2 Auto Scaling.
    • service/codeguruprofiler: Updates service API, documentation, and paginators
    • service/codestar-connections: Updates service API, documentation, and paginators
    • service/ec2: Updates service API, documentation, and paginators
      • Virtual Private Cloud (VPC) customers can now create and manage their own Prefix Lists to simplify VPC configurations.

    Release v1.32.11 (2020-06-26)

    Service Client Updates

    • service/cloudformation: Updates service API and documentation
      • ListStackInstances and DescribeStackInstance now return a new StackInstanceStatus object that contains DetailedStatus values: a disambiguation of the more generic Status value. ListStackInstances output can now be filtered on DetailedStatus using the new Filters parameter.
    • service/cognito-idp: Updates service API

    ... (truncated)


    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

  • Fix false return error message for binary node validation for #2505

    https://github.com/bosun-monitor/bosun/issues/2505

    Description

    Fixes #2505

    Type of change

    • [x] Bug fix (non-breaking change which fixes an issue)
    • [ ] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [ ] This change requires a documentation update

    How has this been tested?

    • [ ] Test A
    • [ ] Test B

    Checklist:

    • [x] This contribution follows the project's code of conduct
    • [x] This contribution follows the project's contributing guidelines
    • [x] My code follows the style guidelines of this project
    • [x] I have performed a self-review of my own code
    • [ ] I have commented my code, particularly in hard-to-understand areas
    • [ ] I have made corresponding changes to the documentation
    • [ ] I have added tests that prove my fix is effective or that my feature works
    • [x] New and existing unit tests pass locally with my changes
    • [ ] Any dependent changes have been merged and published in downstream modules
  • Added "* L4TOUT" to haproxyCheckStatus


    Description

    Scollector did not manage to collect data from HAProxy (HAProxy version 2.0.13-2ubuntu0.5). Got error:

    Apr 28 16:26:34 ServerName scollector[1741859]: error: interval.go:65: haproxy-1-http://localhost:1936/;csv: unknown check status * L4TOUT
    Apr 28 16:26:49 ServerName scollector[1741859]: error: interval.go:65: haproxy-1-http://localhost:1936/;csv: unknown check status * L4TOUT
    

    Printout from HAProxy: [screenshot of the HAProxy stats page]

    Simply added "* L4TOUT" so that it's a valid check status for haproxyCheckStatus.
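
    The change is presumably along these lines (a sketch only: haproxyCheckStatus is assumed to map HAProxy check-status strings to numeric codes, and the neighboring entries and values are illustrative):

    package main

    import "fmt"

    // haproxyCheckStatus maps HAProxy check-status strings to numeric codes.
    // HAProxy prefixes the status with "* " while a check is in progress.
    var haproxyCheckStatus = map[string]int{
        "L4OK":     0,
        "L4TOUT":   1,
        "* L4TOUT": 1, // the new entry from this PR
    }

    func main() {
        fmt.Println(haproxyCheckStatus["* L4TOUT"]) // recognized instead of "unknown check status"
    }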

    Type of change

    • [x] Bug fix (non-breaking change which fixes an issue)
    • [ ] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [ ] This change requires a documentation update

    How has this been tested?

    • [x] HAProxy collection now works again for HAProxy version 2.0.13-2ubuntu0.5

    Checklist:

    • [x] This contribution follows the project's code of conduct
    • [x] This contribution follows the project's contributing guidelines
    • [ ] My code follows the style guidelines of this project
    • [x] I have performed a self-review of my own code
    • [ ] I have commented my code, particularly in hard-to-understand areas
    • [ ] I have made corresponding changes to the documentation
    • [ ] I have added tests that prove my fix is effective or that my feature works
    • [ ] New and existing unit tests pass locally with my changes
    • [ ] Any dependent changes have been merged and published in downstream modules
  • Clarify release status

    We package this for NixOS, and we like to use the latest stable release from upstream.

    https://github.com/bosun-monitor/bosun/releases/tag/0.8.0-preview is listed as the latest release on GitHub. Is it a stable release, or should it be marked pre-release? I ask because the "-preview" suffix makes me think it is an unstable release.
