Smart and simple CSV processing on the command line

csvquote

Are you looking for a way to process CSV data with standard UNIX shell commands?

Are you running into problems with embedded commas and newlines that mess everything up?

Do you wish there were a way to add some CSV intelligence to these UNIX tools?

  • awk, sed
  • cut, join
  • head, tail
  • sort, uniq
  • wc, split

This program can be used at the start and end of a text processing pipeline so that regular UNIX command-line tools can properly handle CSV data that contains commas and newlines inside quoted data fields.

Without this program, embedded special characters would be incorrectly interpreted as separators when they are inside quoted data fields.

By using csvquote, you temporarily replace the special characters inside quoted fields with harmless nonprinting characters that can be processed as data by regular text tools. At the end of processing the text, these nonprinting characters are restored to their previous values.

In short, csvquote wraps the pipeline of UNIX commands to let them work on clean data that is consistently separated, with no ambiguous special characters present inside the data fields.
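The substitute-and-restore idea can be sketched with standard tools. This is only a rough illustration, not csvquote's actual implementation: the function names `sanitize_sketch` and `restore_sketch` are hypothetical, and the choice of the ASCII unit separator (0x1F) and record separator (0x1E) as the nonprinting replacement characters is an assumption.

```shell
# Rough stand-in for `csvquote`: hide commas and newlines that occur inside
# double-quoted fields. The 0x1F / 0x1E replacement bytes are an assumption.
sanitize_sketch() {
  awk '{
    out = ""
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "\"") {
        inq = !inq            # entering or leaving a quoted field
      } else if (inq && c == ",") {
        c = "\037"            # embedded comma becomes 0x1F
      }
      out = out c
    }
    # A line that ends inside quotes continues the same record,
    # so its newline becomes 0x1E instead.
    printf "%s%s", out, (inq ? "\036" : "\n")
  }'
}

# Rough stand-in for `csvquote -u`: put the original characters back.
restore_sketch() {
  tr '\037\036' ',\n'
}

# Round trip: cut sees exactly two fields per record, and the embedded
# comma and newline come back once processing is done.
printf 'id,note\n1,"red, green"\n2,"two\nlines"\n' |
  sanitize_sketch | cut -d, -f2 | restore_sketch
```

A doubled quote (`""`) inside a quoted field toggles the in-quote state twice, so the sketch stays inside the field without needing a special case.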

By default, the program expects to use these as special characters:

" quote character  
, field delimiter  
\n record separator  

It is possible to specify different characters for the field and record separators, such as tabs or pipe symbols.

Note that the quote character can appear inside a quoted field if it is doubled, e.g.

field1,"field2, has a comma in it","field 3 has a ""Quoted String"" in it"

Typical usage of csvquote is as part of a command-line pipeline, so that the regular UNIX text-manipulating commands do not misinterpret special characters found inside fields, e.g.

csvquote foobar.csv | cut -d ',' -f 5 | sort | uniq -c | csvquote -u

or taking input from stdin,

cat foobar.csv | csvquote | cut -d ',' -f 7,4,2 | csvquote -u

Other examples:

csvquote -t foobar.tsv | wc -l

csvquote -q "'" foobar.csv | sort -t, -k3 | csvquote -u

csvquote foobar.csv | awk -F, '{sum+=$3} END {print sum}'
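Another common task is keeping only the rows where a named numeric column is below a threshold. The following is a minimal sketch, assuming the file has a header row and a column named `p_value` (an illustrative name); the helper `filter_p_value` and the file name `data.csv` are hypothetical. Because csvquote leaves the surrounding quotation marks in place, the sketch strips them before comparing, so values such as `"0.03"` are treated as numbers.

```shell
# Hypothetical helper: keep the header plus rows whose p_value is below 0.05.
# Expects input that has already passed through csvquote, so every comma
# it sees is a real field delimiter.
filter_p_value() {
  awk -F, '
    NR == 1 {
      # Find the column by name, ignoring any quotes around the header.
      for (i = 1; i <= NF; i++) {
        name = $i
        gsub(/"/, "", name)
        if (name == "p_value") col = i
      }
      print
      next
    }
    col {
      v = $col
      gsub(/"/, "", v)          # "0.03" -> 0.03
      if (v != "" && v + 0 < 0.05) print
    }
  '
}

# Intended use (data.csv is a placeholder file name):
#   csvquote data.csv | filter_p_value | csvquote -u
```

Rows with an empty `p_value` are dropped, and if no column with that name is found only the header is printed.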

Installation

To install using Go, clone the repo and then build it:

go build -o csvquote cmd/cvsquote/main.go
cp ./csvquote /usr/local/bin

To install using Earthly (Linux):

earthly github.com/adamgordonbell/csvquote+build
cp ./csvquote /usr/local/bin

To install on macOS (x86):

earthly github.com/adamgordonbell/csvquote+for-darwin-amd64
cp ./csvquote /usr/local/bin

To install on Windows:

earthly github.com/adamgordonbell/csvquote+for-windows-amd64
# Then add to your path

History

This is a fork of the original C version by Dan Brown, found here: https://github.com/dbro/csvquote

More specifically, this is a fork of a fork of the C version. It is based on a Go version that Dan wrote at some point. Go makes cross-compilation easier.

Owner
Adam Gordon Bell
have a podcast about software development and work on software builds
Comments
  • Documentation could benefit from a few more examples.

    Documentation does not let me conclude whether or not csvquote is the utility for the task I have in mind.

    How to use csvquote (and awk) to filter the rows of a CSV file where column "p_value" has values under 0.05?

    Users would appreciate that as it is such a common task.

    Thank you.

  • Remove quotation marks?

    Hello! I love this tool! I thought of a small potential improvement:

    If the intention is that I am supposed to be able to work with CSV fields in awk, sed etc. one problem I'm hitting is that I need to deal with the actual quotation marks wrapping the columns. Would it be possible to skip the quotation marks around the fields and then readd them with csvquote -u (if needed?)?

    A problem/use case: I have a CSV file containing financial transactions. It looks like this:

    date,description,amount
    "2022-06-02","McDonalds","22"
    "2022-06-02","Burger King","42"
    

    Unfortunately I'm not in control of the format of this file - that is I can't control the quotation. I would like to negate the amounts. My initial take was to do this with cat file.csv | csvquote | awk '{$3=-$3;print}' | csvquote -u. However, this doesn't work as $3 in awk is "22", not 22.

    Workaround: I can of course do gsub(/"/, "", $3);$3=-$3;$3="\"" $3 "\""; or something, but it's pretty hacky. I could also pipe my file through csvformat, which I think should use minimal quoting, but it would be really nice to not have to do that.

    Proposal: Remove the output quotations because I think they aren't needed. Readd them when doing csvquote -u again (if needed?).

    Does this make sense?
