grate

A Go-native package for extracting tabular data. It currently supports the .xls, .xlsx, .csv, and .tsv formats.

Why?

Grate focuses on speed and stability first and makes no attempt to parse charts, figures, or other content types that may be embedded in the input files. It tries to perform as few allocations as possible and errs on the side of caution.

There are certainly still some bugs and edge cases, but we have run it successfully on a set of 400k .xls and .xlsx files to catch many bugs and error conditions. Please file an issue with any feedback and additional problem files.

Usage

Grate provides a simple standard interface for all supported filetypes, allowing access to both named worksheets in spreadsheets and single tables in plaintext formats.

package main

import (
    "fmt"
    "log"
    "os"
    "strings"

    "github.com/pbnjay/grate"
    _ "github.com/pbnjay/grate/simple" // tsv and csv support
    _ "github.com/pbnjay/grate/xls"
    _ "github.com/pbnjay/grate/xlsx"
)

func main() {
    if len(os.Args) < 2 {
        log.Fatal("usage: pass the path of the file to read")
    }
    wb, err := grate.Open(os.Args[1]) // open the file
    if err != nil {
        log.Fatal(err)
    }
    sheets, err := wb.List() // list available sheets
    if err != nil {
        log.Fatal(err)
    }
    for _, s := range sheets { // enumerate each sheet name
        sheet, err := wb.Get(s) // open the sheet
        if err != nil {
            log.Fatal(err)
        }
        for sheet.Next() { // enumerate each row of data
            row := sheet.Strings() // get the row's content as []string
            fmt.Println(strings.Join(row, "\t"))
        }
    }
    if err := wb.Close(); err != nil {
        log.Fatal(err)
    }
}
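
The row loop above can be wrapped in a small helper that collects a whole sheet into memory. The following is only a sketch: rowIterator, readAll, and fakeSheet are names made up for this illustration, and the interface lists just the two methods the example calls on a sheet (Next and Strings). The fake sheet exists so the sketch runs without a spreadsheet file.

```go
package main

import (
	"fmt"
	"strings"
)

// rowIterator is a local, hypothetical interface naming only the two
// methods that the usage example calls on a sheet.
type rowIterator interface {
	Next() bool
	Strings() []string
}

// readAll drains the iterator and returns every row. Each row is copied
// because this sketch does not assume Strings returns a fresh slice.
func readAll(rows rowIterator) [][]string {
	var out [][]string
	for rows.Next() {
		src := rows.Strings()
		row := make([]string, len(src))
		copy(row, src)
		out = append(out, row)
	}
	return out
}

// fakeSheet is a stand-in used only to run the sketch without a file.
type fakeSheet struct {
	data [][]string
	pos  int
}

// Next advances to the next row and reports whether one exists.
func (f *fakeSheet) Next() bool {
	if f.pos >= len(f.data) {
		return false
	}
	f.pos++
	return true
}

// Strings returns the current row.
func (f *fakeSheet) Strings() []string { return f.data[f.pos-1] }

func main() {
	sheet := &fakeSheet{data: [][]string{{"a", "1"}, {"b", "2"}}}
	for _, row := range readAll(sheet) {
		fmt.Println(strings.Join(row, "\t"))
	}
}
```

Collecting rows this way trades memory for convenience; for large files, processing each row inside the loop is likely the better fit for the package's low-allocation goal.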

License

All source code is licensed under the GNU GPLv3.

Comments
  • Date column prints as days from epoch

    Running grater on xlsm_date_hataly_hatar.xlsm prints:

    F_MODKOD        F_TIPUS F_ERTEK F_HATALY        F_HATAR F_TERITO
    11622   E       4.5     43983   44347   T
    F_MODKOD        F_DIJFIZGYAK    F_DIJFIZMOD     F_ERTEK F_HATALY        F_HATAR F
    13101   E       C       496     43983   44317   T
    F_MODKOD        F_TARTAMTOL     F_TARTAMIG      F_KEZD_MULT     F_NYK_MULT      F_EXTRA_MULT    F_BEF_MULT      F_BEF_MULT2     F_HATALY        F_HATALYIG      F_FL    F_MINIMALIS_POOL        F_INIT_KOCK_MULT        F_LAST_KOCK_MULT      F_RESZVISSZA_KTSG
    13103   1       99      50      3       3       0.12    0.08    43983   44347   0       7144    1       1       1744
    F_MODK  F_TAGSZAM       F_EVESDIJ       F_HATALYTOL     F_HATALYIG
    12410   1       1111    43983   44347
    F_HATALY        F_MODKOD        F_BEKOD F_BOSSZEG 2014  F_TERITO        F_SZAZTOL       F_SZAZIG
    43983   12410   E31001  123456  T       100     100
    

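    Here, the "F_HATALY", "F_HATAR", "F_TARTAMTOL", "F_TARTAMIG", "F_HATALYTOL", and "F_HATALYIG" columns are dates, but they are printed as raw serial day numbers (for example 43983) rather than as formatted dates.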

    xlsm_date_hataly_hatar.xlsm.gz
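
Until the dates are formatted by the library, a caller could convert the serial numbers by hand. The sketch below assumes the workbook uses Excel's 1900 date system, where serials of 61 and above count days from 1899-12-30; a workbook saved with the 1904 date system would need a different base date. serialToDate and excelEpoch are hypothetical names, not part of grate.

```go
package main

import (
	"fmt"
	"time"
)

// excelEpoch is the effective day zero of the 1900 date system for serials
// of 61 and above (it absorbs Excel's fictitious 1900-02-29).
// Assumption: the workbook does not use the 1904 date system.
var excelEpoch = time.Date(1899, time.December, 30, 0, 0, 0, 0, time.UTC)

// serialToDate is a hypothetical helper that converts a whole-day serial
// number, as printed in the output above, into a calendar date.
func serialToDate(serial int) time.Time {
	return excelEpoch.AddDate(0, 0, serial)
}

func main() {
	for _, s := range []int{43983, 44347} {
		fmt.Println(s, serialToDate(s).Format("2006-01-02"))
	}
	// Prints:
	// 43983 2020-06-01
	// 44347 2021-05-31
}
```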

  • xls reads "0" on rows with many integer values

    • Attached is the testing.xls test case.
    • testing.tsv (a file type GitHub does not accept as an attachment) was created by copying the data from testing.xls into it.
    • TestBasic in grate/xls/simple_test.go was edited to use testing.xls and testing.tsv and to log every mismatch.
    
    func TestBasic(t *testing.T) {
    	trueFile, err := os.ReadFile("../testdata/testing.tsv")
    	if err != nil {
    		t.Skip()
    	}
    	lines := strings.Split(string(trueFile), "\n")
    
    	fn := "../testdata/testing.xls"
    	wb, err := Open(fn)
    	if err != nil {
    		t.Fatal(err)
    	}
    
    	sheets, err := wb.List()
    	if err != nil {
    		t.Fatal(err)
    	}
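    	for _, s := range sheets {
    		sheet, err := wb.Get(s)
    		if err != nil {
    			t.Fatal(err)
    		}
    
    		// Compare each row read from the .xls file with the expected line.
    		i := 0
    		for sheet.Next() {
    			row := strings.Join(sheet.Strings(), "\t")
    			if i >= len(lines) {
    				t.Fatalf("row %d has no expected line: '%s'", i, row)
    			}
    			if lines[i] != row {
    				t.Logf("line %d mismatch: '%s' <> '%s'", i, row, lines[i])
    			}
    			i++
    		}
    	}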
    
    	err = wb.Close()
    	if err != nil {
    		t.Fatal(err)
    	}
    }
    

    --- FAIL: TestBasic (0.00s)
        /Users/zeke/Programming/grate/xls/simple_test.go:71: line 2 mismatch: 'b	0	0	0' <> 'b	2	3	4'
        /Users/zeke/Programming/grate/xls/simple_test.go:71: line 4 mismatch: 'b	0	0	0' <> 'b	1	2	1'
        /Users/zeke/Programming/grate/xls/simple_test.go:71: line 5 mismatch: 'b	0	0	0' <> 'b	4	3	2'
        /Users/zeke/Programming/grate/xls/simple_test.go:71: line 6 mismatch: '0	0	0	0' <> '1	1	1   1'
    

    testing.xls

  • date formatting weekdays

    makeFormatter needs some backtracking: currently "dddd" becomes "Sunday", but then "d" is applied to that output, which turns it into "Sun11ay".
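
One way to avoid this kind of re-substitution is to scan the format code once, match the longest day code at each position, and write the expansion directly to the output so that expanded text is never scanned again. The sketch below illustrates that idea only; formatDays, dayCodes, and sampleSunday are hypothetical names, the code handles lowercase day codes only, and it is not how grate's makeFormatter is implemented.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// dayCodes lists the day codes longest first, so "dddd" is matched before
// "ddd", "dd", or "d" can claim part of it. Each code maps to a Go
// reference-time layout.
var dayCodes = []struct {
	code   string
	layout string
}{
	{"dddd", "Monday"},
	{"ddd", "Mon"},
	{"dd", "02"},
	{"d", "2"},
}

// sampleSunday is an arbitrary Sunday that falls on the 11th of a month.
var sampleSunday = time.Date(2021, time.July, 11, 0, 0, 0, 0, time.UTC)

// formatDays is a hypothetical helper. It walks the format left to right
// and appends each expansion to the output, so the "d" in "Sunday" is
// never treated as a format code.
func formatDays(format string, t time.Time) string {
	var out strings.Builder
	for i := 0; i < len(format); {
		matched := false
		for _, dc := range dayCodes {
			if strings.HasPrefix(format[i:], dc.code) {
				out.WriteString(t.Format(dc.layout))
				i += len(dc.code)
				matched = true
				break
			}
		}
		if !matched {
			out.WriteByte(format[i]) // copy literal bytes unchanged
			i++
		}
	}
	return out.String()
}

func main() {
	fmt.Println(formatDays("dddd", sampleSunday))   // Sunday
	fmt.Println(formatDays("dddd d", sampleSunday)) // Sunday 11
}
```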
