Fast, realtime regex-extraction, and aggregation into common formats such as histograms, numerical summaries, tables, and more!

Chris LaPointe

Last update: Dec 29, 2022

Comments: 13

rare

A file scanner/regex extractor and realtime summarizer.

Supports various CLI-based graphing and metric formats (histogram, table, etc).

Features

Multiple summary formats including: filter (like grep), histogram, and numerical analysis
File glob expansions (eg /var/log/* or /var/log/*/*.log) and -R
Optional gzip decompression (with -z)
Following -f or re-open following -F (use --poll to poll)
Ignoring lines that match an expression
Aggregating and realtime summary (Don't have to wait for all data to be scanned)
Multi-threaded reading, parsing, and aggregation
Color-coded outputs (optionally)
Pipe support (stdin for reading, stdout will disable color) eg. tail -f | rare ...

Installation

Notes on versions: Besides your standard OS versions, there is an additional pcre build which is 4x faster than Go's re2 implementation. In order to use this, you must make sure that libpcre2 is installed (eg apt install libpcre2-8-0). Right now, it is only bundled with the Linux distribution.

Manual

Download appropriate binary from Releases, unzip, and put it in /bin

Homebrew

brew tap zix99/rare
brew install rare

From code

Clone the repo, and:

Requires Go 1.11 or higher (uses Go modules)

go get ./...

# Pack documentation (Only necessary for release builds)
go run github.com/gobuffalo/packr/v2/packr2

# Build binary
go build .

# OR, with experimental features
go build -tags experimental .

Available tags:

experimental: Enable experimental features (eg. fuzzy search)
pcre2: Enables PCRE 2 (v10) where able. Currently Linux only

Docs

All documentation may be found here, in the docs/ folder, and by running rare docs (embedded docs/ folder)

You can also see a dump of the CLI options at cli-help.md

Example

Extract status codes from nginx logs

$ rare histo -m '"(\w{3,4}) ([A-Za-z0-9/.]+).*" (\d{3})' -e '{3} {1}' access.log
200 GET                          160663
404 GET                          857
304 GET                          53
200 HEAD                         18
403 GET                          14

Extract number of bytes sent by bucket, and format

This shows an example of how to bucket the values into buckets of size 10000. In this case, it doesn't make sense to see the histogram by exact number of bytes, but we might want to know the ratio of various orders of magnitude.

$ rare histo -m '"(\w{3,4}) ([A-Za-z0-9/.]+).*" (\d{3}) (\d+)' -e "{bucket {4} 10000}" -n 10 access.log -b
0                   144239     ||||||||||||||||||||||||||||||||||||||||||||||||||
190000              2599       
10000               1290       
180000              821        
20000               496        
30000               445        
40000               440        
200000              427        
140000              323        
70000               222        
Matched: 161622 / 161622
Groups:  1203
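
To make the bucketing concrete, here is a minimal Go sketch of what the output above suggests {bucket {4} 10000} does: round each value down to the nearest multiple of the bucket size. The bucket function is a hypothetical helper written for illustration, not rare's actual implementation.

```go
package main

import "fmt"

// bucket rounds value down to the nearest multiple of size.
// Hypothetical helper for illustration; it assumes non-negative
// values such as byte counts (integer division truncates toward zero).
func bucket(value, size int) int {
	if size <= 0 {
		return value
	}
	return (value / size) * size
}

func main() {
	for _, v := range []int{512, 10000, 19506, 195060} {
		fmt.Printf("%d -> %d\n", v, bucket(v, 10000))
	}
}
```

Every value from 0 to 9999 lands in bucket 0, which is consistent with the first row dominating the histogram above.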

Output Formats

Histogram (histo)

The histogram format outputs an aggregation by counting the occurrences of an extracted match. That is to say, every line is tested against a regex, and for lines that match, the matched groups can be used to extract and build a key that acts as the bucket name.

NAME:
   rare histogram - Summarize results by extracting them to a histogram

USAGE:
   rare histogram [command options] <-|filename|glob...>

DESCRIPTION:
   Generates a live-updating histogram of the extracted information from a file
    Each line in the file will be matched, any the matching part extracted
    as a key and counted.
    If an extraction expression is provided with -e, that will be used
    as the key instead

OPTIONS:
   --follow, -f                 Read appended data as file grows
   --reopen, -F                 Same as -f, but will reopen recreated files
   --poll                       When following a file, poll for changes rather than using inotify
   --posix, -p                  Compile regex as against posix standard
   --match value, -m value      Regex to create match groups to summarize on (default: ".*")
   --extract value, -e value    Expression that will generate the key to group by (default: "{0}")
   --gunzip, -z                 Attempt to decompress file when reading
   --batch value                Specifies io batching size. Set to 1 for immediate input (default: 1000)
   --workers value, -w value    Set number of data processors (default: 5)
   --readers value, --wr value  Sets the number of concurrent readers (Infinite when -f) (default: 3)
   --ignore value, -i value     Ignore a match given a truthy expression (Can have multiple)
   --recursive, -R              Recursively walk a non-globbing path and search for plain-files
   --bars, -b                   Display bars as part of histogram
   --num value, -n value        Number of elements to display (default: 5)
   --reverse                    Reverses the display sort-order
   --sortkey, --sk              Sort by key, rather than value
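
The match, extract, and count steps described above can be sketched in a few lines of Go. This is only an illustration under assumptions: the histogram function and accessRe variable are hypothetical names, the key is hard-coded to the equivalent of -e '{3} {1}', and none of it is taken from rare's source.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// accessRe is the regex from the nginx example above.
var accessRe = regexp.MustCompile(`"(\w{3,4}) ([A-Za-z0-9/.]+).*" (\d{3})`)

// histogram counts lines by a key built from match groups 3 and 1,
// the equivalent of the expression '{3} {1}'. Lines that do not
// match are skipped. Hypothetical helper for illustration only.
func histogram(re *regexp.Regexp, lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		m := re.FindStringSubmatch(line)
		if len(m) < 4 {
			continue
		}
		counts[m[3]+" "+m[1]]++
	}
	return counts
}

func main() {
	lines := []string{
		`10.0.0.1 - - [29/Dec/2022:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1021`,
		`10.0.0.2 - - [29/Dec/2022:10:00:01 +0000] "GET /missing HTTP/1.1" 404 153`,
		`10.0.0.3 - - [29/Dec/2022:10:00:02 +0000] "HEAD /index.html HTTP/1.1" 200 0`,
		`not an access log line`,
	}
	counts := histogram(accessRe, lines)

	// Sort keys by count (descending), then by name for stable output.
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool {
		if counts[keys[i]] != counts[keys[j]] {
			return counts[keys[i]] > counts[keys[j]]
		}
		return keys[i] < keys[j]
	})
	for _, k := range keys {
		fmt.Printf("%-20s %d\n", k, counts[k])
	}
}
```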

Filter (filter)

Filter is a command used to match and (optionally) extract that match without any aggregation. It's effectively a grep or a combination of grep, awk, and/or sed.

NAME:
   rare filter - Filter incoming results with search criteria, and output raw matches

USAGE:
   rare filter [command options] <-|filename|glob...>

DESCRIPTION:
   Filters incoming results by a regex, and output the match or an extracted expression.
    Unable to output contextual information due to the application's parallelism.  Use grep if you
    need that

OPTIONS:
   --follow, -f                 Read appended data as file grows
   --reopen, -F                 Same as -f, but will reopen recreated files
   --poll                       When following a file, poll for changes rather than using inotify
   --posix, -p                  Compile regex as against posix standard
   --match value, -m value      Regex to create match groups to summarize on (default: ".*")
   --extract value, -e value    Expression that will generate the key to group by (default: "{0}")
   --gunzip, -z                 Attempt to decompress file when reading
   --batch value                Specifies io batching size. Set to 1 for immediate input (default: 1000)
   --workers value, -w value    Set number of data processors (default: 5)
   --readers value, --wr value  Sets the number of concurrent readers (Infinite when -f) (default: 3)
   --ignore value, -i value     Ignore a match given a truthy expression (Can have multiple)
   --recursive, -R              Recursively walk a non-globbing path and search for plain-files
   --line, -l                   Output line numbers
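
As a rough picture of match-then-extract, the Go sketch below substitutes {N} placeholders in an expression with the corresponding match groups. It assumes only plain numeric placeholders; rare's real expression language has helpers (such as bucket) that this does not model, and expand, filterLines, and abcRe are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// placeholder finds {N} tokens in an extraction expression.
var placeholder = regexp.MustCompile(`\{(\d+)\}`)

// abcRe is a sample pattern with three capture groups.
var abcRe = regexp.MustCompile(`a: (\d+) b: (\d+) c: (\d+)`)

// expand replaces each {N} in expr with groups[N]; placeholders that
// refer to a missing group become empty strings.
func expand(expr string, groups []string) string {
	return placeholder.ReplaceAllStringFunc(expr, func(tok string) string {
		idx, err := strconv.Atoi(tok[1 : len(tok)-1])
		if err != nil || idx >= len(groups) {
			return ""
		}
		return groups[idx]
	})
}

// filterLines returns the expanded expression for every matching line
// and drops lines that do not match, with no aggregation.
func filterLines(re *regexp.Regexp, expr string, lines []string) []string {
	var out []string
	for _, line := range lines {
		if m := re.FindStringSubmatch(line); m != nil {
			out = append(out, expand(expr, m))
		}
	}
	return out
}

func main() {
	lines := []string{"a: 1 b: 2 c: 3", "no match here"}
	for _, s := range filterLines(abcRe, "{1} {2} {3}", lines) {
		fmt.Println(s)
	}
}
```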

Numerical Analysis

This command will extract a number from logs and run basic analysis on that number (such as mean, median, mode, and quantiles).

NAME:
   rare analyze - Numerical analysis on a set of filtered data

USAGE:
   rare analyze [command options] <-|filename|glob...>

DESCRIPTION:
   Treat every extracted expression as a numerical input, and run analysis
    on that input.  Will extract mean, median, mode, min, max.  If specifying --extra
    will also extract std deviation, and quantiles

OPTIONS:
   --follow, -f                 Read appended data as file grows
   --reopen, -F                 Same as -f, but will reopen recreated files
   --poll                       When following a file, poll for changes rather than using inotify
   --posix, -p                  Compile regex as against posix standard
   --match value, -m value      Regex to create match groups to summarize on (default: ".*")
   --extract value, -e value    Expression that will generate the key to group by (default: "{0}")
   --gunzip, -z                 Attempt to decompress file when reading
   --batch value                Specifies io batching size. Set to 1 for immediate input (default: 1000)
   --workers value, -w value    Set number of data processors (default: 5)
   --readers value, --wr value  Sets the number of concurrent readers (Infinite when -f) (default: 3)
   --ignore value, -i value     Ignore a match given a truthy expression (Can have multiple)
   --recursive, -R              Recursively walk a non-globbing path and search for plain-files
   --extra                      Displays extra analysis on the data (Requires more memory and cpu)
   --reverse, -r                Reverses the numerical series when ordered-analysis takes place (eg Quantile)
   --quantile value, -q value   Adds a quantile to the output set. Requires --extra (default: "90", "99", "99.9")

Example:

$ go run *.go --color analyze -m '"(\w{3,4}) ([A-Za-z0-9/.@_-]+).*" (\d{3}) (\d+)' -e "{4}" testdata/access.log 
Samples:  161,622
Mean:     2,566,283.9616
Min:      0.0000
Max:      1,198,677,592.0000

Median:   1,021.0000
Mode:     1,021.0000
P90:      19,506.0000
P99:      64,757,808.0000
P99.9:    395,186,166.0000
Matched: 161,622 / 161,622
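
For readers unfamiliar with the statistics reported above, here is a small Go sketch of mean, median, and quantiles over a set of extracted numbers. It assumes the nearest-rank quantile definition, which may differ from the method rare uses; mean and quantile are hypothetical helpers and the sample values are made up.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// mean returns the arithmetic mean, or 0 for an empty slice.
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// quantile returns the nearest-rank quantile of an ascending-sorted
// slice, with q in [0, 1]. Assumption: nearest-rank, not interpolation.
func quantile(sorted []float64, q float64) float64 {
	if len(sorted) == 0 {
		return 0
	}
	idx := int(math.Ceil(q*float64(len(sorted)))) - 1
	if idx < 0 {
		idx = 0
	}
	if idx >= len(sorted) {
		idx = len(sorted) - 1
	}
	return sorted[idx]
}

func main() {
	// Made-up byte counts standing in for extracted {4} values.
	values := []float64{1021, 0, 19506, 1021, 512}
	sort.Float64s(values)

	fmt.Printf("Samples: %d\n", len(values))
	fmt.Printf("Mean:    %.4f\n", mean(values))
	fmt.Printf("Min:     %.4f\n", values[0])
	fmt.Printf("Max:     %.4f\n", values[len(values)-1])
	fmt.Printf("Median:  %.4f\n", quantile(values, 0.5))
	fmt.Printf("P90:     %.4f\n", quantile(values, 0.9))
}
```

Quantiles need the whole sorted series in memory, which is one plausible reason they sit behind --extra.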

Tabulate

Create a 2D view (table) of data extracted from a file. The expression needs to yield two dimensions separated by the delimiter character (\x00 by default, see --delim). You can either emit \x00 directly or use the {$ a b} helper. The first element is the column name, followed by the row name.

NAME:
   rare tabulate - Create a 2D summarizing table of extracted data

USAGE:
   rare tabulate [command options] <-|filename|glob...>

DESCRIPTION:
   Summarizes the extracted data as a 2D data table.
    The key is provided in the expression, and should be separated by a tab \x00
    character or via {$ a b} Where a is the column header, and b is the row

OPTIONS:
   --follow, -f                 Read appended data as file grows
   --reopen, -F                 Same as -f, but will reopen recreated files
   --poll                       When following a file, poll for changes rather than using inotify
   --posix, -p                  Compile regex as against posix standard
   --match value, -m value      Regex to create match groups to summarize on (default: ".*")
   --extract value, -e value    Expression that will generate the key to group by (default: "{0}")
   --gunzip, -z                 Attempt to decompress file when reading
   --batch value                Specifies io batching size. Set to 1 for immediate input (default: 1000)
   --workers value, -w value    Set number of data processors (default: 5)
   --readers value, --wr value  Sets the number of concurrent readers (Infinite when -f) (default: 3)
   --ignore value, -i value     Ignore a match given a truthy expression (Can have multiple)
   --recursive, -R              Recursively walk a non-globbing path and search for plain-files
   --delim value                Character to tabulate on. Use {$} helper by default (default: "\x00")
   --num value, -n value        Number of elements to display (default: 20)
   --cols value                 Number of columns to display (default: 10)
   --sortkey, --sk              Sort rows by key name rather than by values

Example:

$ rare tabulate -m "(\d{3}) (\d+)" -e "{$ {1} {bucket {2} 100000}}" -sk access.log

         200      404      304      403      301      206      
0        153,271  860      53       14       12       2                 
1000000  796      0        0        0        0        0                 
2000000  513      0        0        0        0        0                 
7000000  262      0        0        0        0        0                 
4000000  257      0        0        0        0        0                 
6000000  221      0        0        0        0        0                 
5000000  218      0        0        0        0        0                 
9000000  206      0        0        0        0        0                 
3000000  202      0        0        0        0        0                 
10000000 201      0        0        0        0        0                 
11000000 190      0        0        0        0        0                 
21000000 142      0        0        0        0        0                 
15000000 138      0        0        0        0        0                 
8000000  137      0        0        0        0        0                 
22000000 123      0        0        0        0        0                 
14000000 121      0        0        0        0        0                 
16000000 110      0        0        0        0        0                 
17000000 99       0        0        0        0        0                 
34000000 91       0        0        0        0        0                 
Matched: 161,622 / 161,622
Rows: 223; Cols: 6
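
The table above is essentially a two-level counter keyed by row and then column. A minimal Go sketch of that data structure follows, using the status code as the column and the bucketed byte count as the row. Table, Add, and statusBytesRe are hypothetical names, and the sample lines are made up.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Table counts occurrences per (row, column) pair: Table[row][col].
type Table map[string]map[string]int

// Add increments the cell at the given column and row.
func (tb Table) Add(col, row string) {
	if tb[row] == nil {
		tb[row] = make(map[string]int)
	}
	tb[row][col]++
}

// statusBytesRe mirrors the -m pattern from the example above.
var statusBytesRe = regexp.MustCompile(`(\d{3}) (\d+)`)

func main() {
	lines := []string{"200 1021", "200 250000", "404 153", "200 99"}

	table := Table{}
	for _, line := range lines {
		m := statusBytesRe.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		size, err := strconv.Atoi(m[2])
		if err != nil {
			continue
		}
		// Equivalent in spirit to {$ {1} {bucket {2} 100000}}.
		row := strconv.Itoa((size / 100000) * 100000)
		table.Add(m[1], row)
	}
	fmt.Println(table)
}
```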

Performance Benchmarking

I know there are different solutions, and rare accomplishes summarization in a way that grep, awk, etc can't. Still, I think it's worth comparing the performance of this tool against standard tools to show that it's at least as good.

It's worth noting that in many of these results rare is just as fast, but part of that reason is that it consumes CPU in a more efficient way (go is great at parallelization). So take that into account, for better or worse.

All tests were done on ~200MB of gzip'd nginx logs spread across 10 files.

Each program was run 3 times and the last time was taken (to make sure things were cached equally).

zcat & grep

$ time zcat testdata/* | grep -Poa '" (\d{3})' | wc -l
1131354

real	0m0.990s
user	0m1.480s
sys	0m0.080s

$ time zcat testdata/* | grep -Poa '" 200' > /dev/null

real	0m1.136s
user	0m1.644s
sys	0m0.044s

I believe the largest holdup here is the fact that zcat will pass all the data to grep via a synchronous pipe, whereas rare can process everything in async batches. Using pigz instead didn't yield different results, but on single-file inputs they did perform comparably.
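
One way to picture "async batches" is a reader that hands slices of lines to a pool of workers over a channel, so matching proceeds in parallel while reading continues. The Go sketch below illustrates that general pattern only; it is not rare's actual pipeline, and countMatches and statusRe are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// statusRe is the pattern used in the benchmarks above.
var statusRe = regexp.MustCompile(`" (\d{3})`)

// countMatches splits lines into batches, fans them out to a pool of
// workers, and sums the per-worker match counts.
func countMatches(re *regexp.Regexp, lines []string, batchSize, workers int) int {
	if batchSize < 1 {
		batchSize = 1
	}
	if workers < 1 {
		workers = 1
	}
	batches := make(chan []string, workers)
	results := make(chan int, workers) // each worker sends exactly once

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			n := 0
			for batch := range batches {
				for _, line := range batch {
					if re.MatchString(line) {
						n++
					}
				}
			}
			results <- n
		}()
	}

	// The "reader": emit batches while workers are already matching.
	for start := 0; start < len(lines); start += batchSize {
		end := start + batchSize
		if end > len(lines) {
			end = len(lines)
		}
		batches <- lines[start:end]
	}
	close(batches)
	wg.Wait()
	close(results)

	total := 0
	for n := range results {
		total += n
	}
	return total
}

func main() {
	lines := []string{
		`"GET / HTTP/1.1" 200 10`,
		`noise`,
		`"GET /x HTTP/1.1" 404 0`,
		`more noise`,
		`"HEAD / HTTP/1.1" 200 0`,
	}
	fmt.Println(countMatches(statusRe, lines, 2, 3))
}
```

Batching amortizes channel overhead across many lines, which is presumably the role of the --batch option listed in the help output.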

Silver Searcher (ag)

$ ag --version
ag version 0.31.0

Features:
  +jit +lzma +zlib

$ time ag -z '" (\d{3})' testdata/* | wc -l
1131354

real	0m3.944s
user	0m3.904s
sys	0m0.152s

rare

$ rare -v
rare version 0.1.16, 11ca2bfc4ad35683c59929a74ad023cc762a29ae

$ time rare filter -m '" (\d{3})' -e "{1}" -z testdata/* | wc -l
Matched: 1,131,354 / 3,638,594
1131354

real	0m0.927s
user	0m1.764s
sys	0m1.144s

$ time rare histo -m '" (\d{3})' -e "{1}" -z testdata/*
200                 1,124,767 
404                 6,020     
304                 371       
403                 98        
301                 84        

Matched: 1,131,354 / 3,638,594
Groups:  6

real	0m0.284s
user	0m1.648s
sys	0m0.048s

Development

New additions to rare should pass the following checks

Documentation for any new functionality or expression changes
Before and after CPU and memory benchmarking for core additions (Expressions, aggregation, benchmarking, and rendering)
Limit memory allocations (preferably 0!) in the high-throughput functions
Tests, and if it makes sense, benchmarks of a given function

Running/Testing

go run .
go test ./...

Profiling

New high-throughput changes should be performance benchmarked.

To Benchmark:

go run . --profile out <your test code>
go tool pprof -http=:8080 out.cpu.prof # CPU
go tool pprof -http=:8080 out_num.prof # Memory

License

Copyright (C) 2019  Christopher LaPointe

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <https://www.gnu.org/licenses/>.

Owner

Chris LaPointe

Full stack Software Engineer at TripAdvisor, focusing on common services and backend infrastructure. I like to make other engineers' lives easier.

https://github.com/zix99/rare

Comments

Panic in coloring logic when using nested groups in 'filter' mode

First of all, let me say, this is an awesome project. Nice work! I was in the process of writing something similar but much much worse; I think I might use this as a library instead!

I've managed to cause a panic in the following scenario:

echo '1,2,3,4,5,6,7,8,9,0' | rare filter -m "(^[^,]*)(,([^,]*)){5}"
panic: runtime error: slice bounds out of range [11:10]

goroutine 1 [running]:
rare/pkg/color.WrapIndices(0xc000500000, 0x13, 0xc0000b4590, 0x6, 0x6, 0x206860, 0x1207860)
    /Users/ondrejb/Documents/git/rare/pkg/color/coloring.go:95 +0x832
rare/cmd.filterFunction(0xc00024c840, 0x0, 0xc0004752d0)
    /Users/ondrejb/Documents/git/rare/cmd/filter.go:33 +0x276
github.com/urfave/cli.HandleAction(0x13b20c0, 0x144c828, 0xc00024c840, 0xc00024c840, 0x0)
    /Users/ondrejb/go/pkg/mod/github.com/urfave/[email protected]/app.go:523 +0xfd
github.com/urfave/cli.Command.Run(0x142c0a8, 0x6, 0x1429f04, 0x1, 0x0, 0x0, 0x0, 0x1441dfa, 0x44, 0x0, ...)
    /Users/ondrejb/go/pkg/mod/github.com/urfave/[email protected]/command.go:174 +0x58e
github.com/urfave/cli.(*App).Run(0xc0004d6000, 0xc000090040, 0x4, 0x4, 0x0, 0x0)
    /Users/ondrejb/go/pkg/mod/github.com/urfave/[email protected]/app.go:276 +0x7d4
main.cliMain(0xc000090040, 0x4, 0x4, 0x0, 0x0)
    /Users/ondrejb/Documents/git/rare/main.go:101 +0x666
main.main()
    /Users/ondrejb/Documents/git/rare/main.go:105 +0x49

To save you parsing manually, the three groups here are the 1st column of a csv, the 6th column but including the leading comma, and the 6th field without the leading comma.

When I make the second (outer) group non-capturing, ie. (^[^,]*)(?:,([^,]*)){5}, everything works fine and I get the 1st and 6th field (groups {1} and {3}). Obviously, if I use the --nocolor option, or if I use an expression eg. -e '{1} {2} {3}', everything is fine.

I haven't looked deep into the code yet, but obviously since the match groups overlap, the starting index of the inner group lies inside the outer group, and the colouring logic doesn't account for this scenario.

I'd suggest that the inner match should take precedence when colouring the matching text (ie. inner match colours "overwrite" the outer group)

I'll have a crack at making a pull request to fix this myself soon.

Invalid syntax in tap

Hey,

Just tried to install from Homebrew and I get an error:

❯ brew tap zix99/rare
==> Tapping zix99/rare
Cloning into '/usr/local/Homebrew/Library/Taps/zix99/homebrew-rare'...
remote: Enumerating objects: 45, done.
remote: Counting objects: 100% (45/45), done.
remote: Compressing objects: 100% (30/30), done.
remote: Total 45 (delta 14), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (45/45), 6.65 KiB | 3.33 MiB/s, done.
Resolving deltas: 100% (14/14), done.
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/zix99/homebrew-rare/rare.rb
rare: Calling bottle :unneeded is disabled! There is no replacement.
Please report this issue to the zix99/rare tap (not Homebrew/brew or Homebrew/core):
  /usr/local/Homebrew/Library/Taps/zix99/homebrew-rare/rare.rb:9

Error: Cannot tap zix99/rare: invalid syntax in tap!

Is there anything I need to do other than

brew tap zix99/rare && brew install rare

I'm doing this on a 2019 Mac Book Pro (Intel) Thank you

Sort heatmap columns numerically

First off, great work with the heatmaps feature!

I've been using heatmaps for numerical data, in particular nginx response times. I usually convert them to integers in rare by matching eg 0.053 (seconds) with (\d+)\.(\d{3}) and then using the expression {sumi {multi {1} 1000} {3}} to convert to milliseconds.

I've found that the table and heatmap sort the column names as strings, and not numerically if possible. This results in meaningless heatmaps:

I made a small change to a local checkout of rare to basically test if the column headers could be converted to integers and then sort them numerically if they can:

index 1be0be8..040b889 100644
--- a/pkg/aggregation/table.go
+++ b/pkg/aggregation/table.go
@@ -103,8 +103,23 @@ func (s *TableAggregator) OrderedColumns() []string {
 func (s *TableAggregator) OrderedColumnsByName() []string {
        keys := s.Columns()
 
+       // check if keys can be sorted numerically:
+       numeric := true
+       for _,k := range keys {
+               if _, err := strconv.Atoi(k); err != nil {
+                       numeric = false
+                       break
+               }
+       }
+
        sort.Slice(keys, func(i, j int) bool {
-               return keys[i] < keys[j]
+               if numeric {
+                       k0, _ := strconv.Atoi(keys[i])
+                       k1, _ := strconv.Atoi(keys[j])
+                       return k0 < k1
+               } else {
+                       return keys[i] < keys[j]
+               }
        })
 
        return keys

The same heatmap is now much more meaningful:

Would you be interested in incorporating the above diff into rare?

Bump mkdocs from 1.2.1 to 1.2.3

Bumps mkdocs from 1.2.1 to 1.2.3.

Release notes

Sourced from mkdocs's releases.

1.2.3

MkDocs 1.2.3 is a bugfix release for MkDocs 1.2.

Aside: MkDocs has a new chat room on Gitter/Matrix. More details.

Improvements:

Built-in themes now also support these languages:

Simplified Chinese (#2497)

Japanese (#2525)

Brazilian Portuguese (#2535)

Spanish (#2545, previously #2396)

Third-party plugins will take precedence over built-in plugins with the same name (#2591)

Bugfix: Fix ability to load translations for some languages: core support (#2565) and search plugin support with fallbacks (#2602)

Bugfix (regression in 1.2): Prevent directory traversal in the dev server (#2604)

Bugfix (regression in 1.2): Prevent webserver warnings from being treated as a build failure in strict mode (#2607)

Bugfix: Correctly print colorful messages in the terminal on Windows (#2606)

Bugfix: Python version 3.10 was displayed incorrectly in --version (#2618)

Other small improvements; see commit log.

1.2.2

MkDocs 1.2.2 is a bugfix release for MkDocs 1.2 -- make sure you've seen the "major" release notes as well.

Bugfix (regression in 1.2): Fix serving files/paths with Unicode characters (#2464)

Bugfix (regression in 1.2): Revert livereload file watching to use polling observer (#2477)

This had to be done to reasonably support usages that span virtual filesystems such as non-native Docker and network mounts.

This goes back to the polling approach, very similar to that was always used prior, meaning most of the same downsides with latency and CPU usage.

Revert from 1.2: Remove the requirement of a site_url config and the restriction on use_directory_urls (#2490)

Bugfix (regression in 1.2): Don't require trailing slash in the URL when serving a directory index in mkdocs serve server (#2507)

Instead of showing a 404 error, detect if it's a directory and redirect to a path with a trailing slash added, like before.

Bugfix: Fix gh_deploy with config-file in the current directory (#2481)

Bugfix: Fix reversed breadcrumbs in "readthedocs" theme (#2179)

Allow "mkdocs.yaml" as the file name when '--config' is not passed (#2478)

... (truncated)

Commits

d167eab Release 1.2.3 (#2614)

5629b09 Re-format translation files to pass a lint check (#2621)

2c4679b Re-format translation files to pass a lint check (#2620)

9262cc5 Fix the code to abbreviate Python's version (#2618)

8345850 Add hint about -f/--config-file in configuration documentation (#2616)

815af48 Added translation for Brazilian Portuguese (#2535)

6563439 Update contact instructions: announce chat, preference for issues (#2610)

6b72eef We can again announce support of zh_CN locale (#2609)

b18ae29 Drop assert_mock_called_once compat method from tests (#2611)

7a27572 Isolate strict warning counter to just the ongoing build (#2607)

Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR

@dependabot recreate will recreate this PR, overwriting any edits that have been made to it

@dependabot merge will merge this PR after your CI passes on it

@dependabot squash and merge will squash and merge this PR after your CI passes on it

@dependabot cancel merge will cancel a previously requested merge and block automerging

@dependabot reopen will reopen this PR if it is closed

@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually

@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot use these labels will set the current labels as the default for future PRs for this repo and language

@dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language

@dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language

@dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the Security Alerts page.

Regression: 'rare filter' now ignores '-e' form of '--extract' flag

[Screenshots not preserved: before; now; now but with --extract]

I've taken screenshots to preserve image highlighting; here's the cut-and-paste friendly test case:

echo "a: 1 b: 2 c: 3" | ./rare f -m 'a: (\d+) b: (\d+) c: (\d+)' -e '{1} {2} {3}'

Everything seems to work fine using the short '-e' with histogram, analyze, etc; it's just filter that is impacted.

Optimize memory usage by reducing buffered batches by default

With default settings, rare used ~50MB consistently. These tweaks and settings lower it to ~10MB while maintaining performance. For io-burst systems, you can tweak up the buffered batches via CLI.

Tables2

Refactor how table color-coding is handled

More dense information in table

Add totals column/row

Change sorting to be consistent with other display methods

Follow reader

Replacing gotail with internal file following is a significant performance improvement. The previous benchmark maxed out at 20-30 MB/sec (likely mostly because of the single-batched channel from gotail). New benchmarks max out at 500 MB/sec, and seem limited by disk at that point.

Readthru

Introduce immediate-readahead, providing more immediate results at similar performance to readahead, and with the same memory characteristics. Resolves #61

Bump github.com/tidwall/gjson from 1.3.5 to 1.9.3

Bumps github.com/tidwall/gjson from 1.3.5 to 1.9.3.

Commits

77a57fd Limit the complexity of "like" queries that match on a pattern.

590010f Update match dependency

61273bf Update dependency

78289be Create FUNDING.yml

75046d2 Update comments

5827eb3 Merge pull request #226 from ifraixedes/if/fix-doc-map-method

807836a Minor update

44b8c19 Minor update to test

160fb9d Updated comments

52919fa Merge pull request #222 from sspaink/arrayindex

Additional commits viewable in compare view

Bump to go 1.17

Initial benchmarks show the callsite improvements are speeding up rare by a few percentage points. Will post some benchmarks soon.

Histogram benchmark (1.5 GB of logs):

go 1.16: 23s real; 1m17s user time
go 1.17: 20s real; 1m7s user time

About a 12% savings on user-time at a high level.

Why not use grok?

Hello, grok is a generally common log parsing language that allows for a clear combination of regular expressions. It is used in tools like logstash and vector. I was just curious why you opted for traditional regex and match groups rather than using grok.

Thanks, Cam.

Memory and CPU usage

I'm curious whether the README could include a comparison of memory and CPU usage between standard unix tools, ag, and rare?

A benchmark on an embedded device would also be useful.
