Ripgrep, but for gzip-compressed files over HTTP

Juicer

It's ripgrep, but for gzip-compressed files over HTTP!

This tool was designed primarily to scan through the Common Crawl dataset for URLs without spending a fortune on AWS.

Features:

  • Extremely fast regex engine (Intel Hyperscan)
  • Scan through terabytes of data without writing it to disk
  • Concurrent scanning of multiple files

TODO:

  • Client/server for handing out scanning tasks
  • Zstandard support? (for IA WARCs)
Owner
Boom