Tool, language and decoders for inspecting binary data.

fq

In most cases fq works the same way as jq, but instead of reading JSON it reads binary data. The result is a JSON-compatible structure in which each value has a bit range and symbolic interpretations, and knows how to present itself in a useful way.

NOTE: fq is early in development and many things are missing, broken or do not make sense. That also means there is a great opportunity to help out!

Goals

  • Make binary formats accessible and queryable.
  • Nested formats and bit-oriented decoding.
  • Quick and comfortable CLI tool.
  • Bit and byte transformations and conversions.
  • Programmer's calculator.

Usage

Basic usage is fq . file.

For details see usage.md

Install

Download the release for your platform, unarchive it and move the executable to a directory in your PATH.

Homebrew

# install latest release
brew install wader/tap/fq

Build from source

Make sure you have Go 1.17 or later installed.

To install directly from the git repository:

# build and install latest release
go install github.com/wader/fq@latest

# or build and install latest master
go install github.com/wader/fq@master

# copy binary to $PATH if needed
cp "$(go env GOPATH)/bin/fq" /usr/local/bin

To build and run tests from source directory:

make test fq
# copy binary to $PATH if needed
cp fq /usr/local/bin

Supported formats

aac_frame, adts, adts_frame, apev2, av1_ccr, av1_frame, av1_obu, avc_annexb, avc_au, avc_dcr, avc_nalu, avc_pps, avc_sei, avc_sps, bzip2, dns, dns_tcp, elf, ether8023_frame, exif, flac, flac_frame, flac_metadatablock, flac_metadatablocks, flac_picture, flac_streaminfo, gif, gzip, hevc_annexb, hevc_au, hevc_dcr, hevc_nalu, icc_profile, icmp, id3v1, id3v11, id3v2, ipv4_packet, jpeg, json, matroska, mp3, mp3_frame, mp4, mpeg_asc, mpeg_es, mpeg_pes, mpeg_pes_packet, mpeg_spu, mpeg_ts, ogg, ogg_page, opus_packet, pcap, pcapng, png, protobuf, protobuf_widevine, pssh_playready, raw, sll2_packet, sll_packet, tar, tcp_segment, tiff, udp_datagram, vorbis_comment, vorbis_packet, vp8_frame, vp9_cfm, vp9_frame, vpx_ccr, wav, webp, xing, zip

For details see formats.md

TODO and ideas

See TODO.md

Development

See dev.md

Thanks and related projects

This project would not have been possible without itchyny's jq implementation gojq. I also want to thank HexFiend for inspiration and ideas and stedolan for inventing the jq language.

Similar or related projects:

Owner
Mattias Wadman
Comments
  • Add WASM format

    Hello, I have created this because I personally needed it. Tested against a bunch of .wasm / .fqtest pairs in format/wasm/testdata/core. (The .wasm files are generated from the standard test suite, i.e. github.com/WebAssembly/spec/test/core.)

    I would be happy to receive any feedback.

    Thank you!

  • macho: Initial impl for macho support.

    [WIP] Several issues remain for symbol mapping for flags and constant fields. The UNIX/THREAD LC .state field has to be visualized for PPC/64, ARM/64 and X86-64.

    I have created a working barebones implementation. There are several improvements to be made here:

    • We need a SymStr mapping for cpusubtype, but it depends on cputype and an arbitrary number.
    • Flag fields for individual messages are uint32_t. I think it would be user friendly to expand them to show which flags are set.
    • LC_THREAD/LC_UNIXTHREAD commands have a map of register name to register value. The visualization for this must be implemented for the major architectures, as I have written in the commit message.

    It correctly parses the fq binary on macOS.

    Solves #43

  • Apple bookmarkData

    This PR implements a decoder for macOS/iOS bookmarkData blobs, which are often found within binary plist files. These are used to resolve URL objects for a file, even if the user moves or renames it.

    There are some issues that I could use help ironing out, namely getting it to work properly as a nested format using grep_by or select in recursive jq expressions. I think there may be something wrong with the way torepr and tobytes are working in the bplist decoder. For instance this does not work in the way that I would think it would (does not yield any results):

    fq '.. | select(format=="bookmark") | .map(. | torepr)' com.apple.LSSharedFileList.RecentApplications.sfl2
    

    In this case, when a bplist data type is passed to tobytes, the bytes output are the truncated base64 representation:

    ./fq 'torepr."$objects"[15] | tobytes | bookmark' ~/Library/Application\ Support/com.apple.sharedfilelist/com.apple.LSSharedFileList.RecentApplications.sfl2
    
  • Avro: Add decoder

    Full avro OCF support. Handles all primitive, complex, and logical types besides decimals.

    Able to handle deflate, snappy, and null codecs for blocks.

    Requirements for WIP removal:

    • ~Support common logical types (date, decimal, duration, time, timestamp)~
    • ~Add test case with all avro datatypes~
    • ~Evaluate viability of splitting avro datum into a subdecoder~
    • ~Cleanup OCF header decoding~
    • ~Cleanup schema decoding~
    • ~Finalize design around handling OCF codecs. (Currently only handles null codec, rest treat the datum as raw bytes and won't decode them)~
  • Adds support for Apple Binary Plist, version 00

    This adds support for decoding Apple Binary Plists. The only well documented version is 00, and is therefore the only one supported here. I have tested this on both large and small binary plists, including ones with nested dictionaries.

  • bplist: NSKeyedArchiver jq function

    NSKeyedArchiver stores objects in a bplist format by flattening the object into a set of keys and values, which reference each other by index. Common examples are the sfl2 files located in ~/Library/Application Support/com.apple.sharedfilelist. @wader proposed the following function for reconstructing these objects into a more meaningful JSON representation:

    def from_ns_keyed_archiver:
      (  . as {"$objects": $objs, "$top": {root: $root_uid}}
      | def _f($id):
          ( . #debug({$id})
          | $objs[$id]
          # | debug
          | if type == "string" then .
            elif type == "number" then .
            else
              (. as {"$class": $class}
              | if $class == 13 then # NSDictionary?
                  ( . as {"NS.keys": $ns_keys, "NS.objects": $ns_objects}
                  | [$ns_keys, $ns_objects]
                  | transpose
                  | map(
                      ( . as [$k, $o]
                      | {key: _f($k), value: _f($o)}
                      )
                    )
                  # | debug
                  | from_entries
                  )
                elif $class == 58 then #?
                  ( . as {"NS.objects": $ns_objects}
                  | $ns_objects
                  | map(_f(.))
                  )
                else "class-\($class)"
                end
              )
            end
          );
        _f($root_uid)
      );
    

    However, it was found that the class numbers are not consistent across multiple files, so relying on them for interpreting underlying types is not a general solution. The following seems to work:

    def from_ns_keyed_archiver:
      (  . as {"$objects": $objs, "$top": {root: $root_uid}}
      | def _f($id):
          ( . #| debug({$id})
          | $objs[$id]
          #| debug
          | if type == "string" then .
            elif type == "number" then .
            else
              (. as {"$class": $class}
              | . #debug
              | if ."NS.keys" != null and ."NS.objects" != null then
                  ( . as {"NS.keys": $ns_keys, "NS.objects": $ns_objects}
                  | [$ns_keys, $ns_objects]
                  | transpose
                  | map
                    (
                      ( . as [$k, $o]
                      | {key: _f($k), value: _f($o)}
                      )
                    )
                  | from_entries
                  )
                elif ."NS.objects" != null then
                  ( . as {"NS.objects": $ns_objects}
                  | $ns_objects
                  | map(_f(.))
                  )
                else "class-\($class)"
                end
              )
            end
          );
        _f($root_uid)
      );
    

    However, we are not yet sure that this is a best practice, since it was created from a heuristic approach that is not based on any known reference documentation. More work is needed to find the best way of identifying arrays and objects within NSKeyedArchiver representations.

  • Run expect script on windows CI

    CI env is https://github.com/actions/virtual-environments/blob/main/images/win/Windows2022-Readme.md

    The expect test is run from https://github.com/wader/fq/blob/master/Makefile#L24 using https://github.com/wader/fq/blob/master/pkg/cli/test.sh and https://github.com/wader/fq/blob/master/pkg/cli/test.exp. The shell script is a helper around the expect script that makes it possible to pass an fq binary to test, stays silent on success and prints the log on failure.

    The Windows image has make and bash, so hopefully it is mostly about installing expect and then checking whether expect.exe is in PATH?

  • doc: Add Nix instructions to readme

    fq is packaged in Nixpkgs as of https://github.com/NixOS/nixpkgs/pull/151871, so these instructions will work within a few days (once the master channel advances.)

  • Incorrect version in some distribution packages

    $ docker run --rm alpine:3.15 sh -c 'apk add -X http://dl-cdn.alpinelinux.org/alpine/edge/testing fq && fq --version'
    fetch http://dl-cdn.alpinelinux.org/alpine/edge/testing/x86_64/APKINDEX.tar.gz
    fetch https://dl-cdn.alpinelinux.org/alpine/v3.15/main/x86_64/APKINDEX.tar.gz
    fetch https://dl-cdn.alpinelinux.org/alpine/v3.15/community/x86_64/APKINDEX.tar.gz
    (1/1) Installing fq (0.0.3-r0)
    Executing busybox-1.34.1-r3.trigger
    OK: 15 MiB in 15 packages
    dev
    

    Currently the version is set by goreleaser/make at build time using the go build ldflag -X main.version=.... To make it work with distributions that build manually using go build etc., we probably have to include the version in the source somehow.

    • Create some version file using goreleaser when packaging source archives. Won't fix the issue if a distribution clones the repo.
    • Convince package maintainers to build using make and pass the version as some argument? Too much hassle, I think.
    • Manually change/commit some version file before release
    • Some other way?
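
To make the mechanism concrete, here is a minimal sketch of how a link-time version variable behaves and why a plain go build prints dev. The helper pickVersion and the fallback to the module version reported by runtime/debug.ReadBuildInfo are assumptions added for illustration; they are not taken from fq's source and are just one possible reading of "some other way".

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// version is meant to be overridden at link time, for example:
//
//	go build -ldflags "-X main.version=0.0.3"
//
// A plain `go build` leaves the default in place, which matches the "dev"
// output in the report above.
var version = "dev"

// pickVersion is a hypothetical helper: it prefers the linker-provided value
// and otherwise falls back to the module version recorded by the Go toolchain.
func pickVersion(linked, module string) string {
	if linked != "" && linked != "dev" {
		return linked
	}
	if module != "" && module != "(devel)" {
		return module
	}
	return "dev"
}

func main() {
	module := ""
	if info, ok := debug.ReadBuildInfo(); ok {
		module = info.Main.Version
	}
	fmt.Println(pickVersion(version, module))
}
```

The fallback mainly helps builds made with go install ...@version, where the toolchain records the module version. A distribution that builds from an unpacked source archive would most likely still see dev, so a version file in the source may still be needed.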
  • mp4: Properly use trun data offset

    Each trun has its own data offset; previously the last offset was wrongly used for all truns. This could also cause sample ranges to extend beyond EOF.

    tenc: Decode default constant iv

    Fixes #292

  • Elf parsed as mp3

    Is it expected to see this result, with an ELF file (/bin/ls) parsed as mp3?

    $ file /bin/ls
    /bin/ls: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=2f15ad836be3339dec0e2e6a3c637e08e48aacbd, for GNU/Linux 3.2.0, stripped
    $ fq d /bin/ls
    

  • csv: Add more from/to options

    • Add header option, on by default
    • Disable comments by default
    • Rename comma to delimiter
    • Add quote_char option
    • Add skip initial space option

    Uses a forked version of std csv to support custom quote character

    See top of csv.go for TODOs

  • JSON to maintain array on predefined objects

    @wader, opening the discussion in the right forum now.

    Could you please provide a combined command that does both tasks at once?

    • I would like to specify only a few objects to become arrays, such as ".aitem, .oitem, .sitem", not the whole JSON.
    • Convert all numbers, floats and booleans to non-string values, i.e. without double quotes.

    <root>
        <aitem>
            <name>abc</name>
            <value>123</value>
        </aitem>
        <bitem>
            <name>bbbb</name>
            <value>2222</value>
        </bitem>
        <bitem>
            <name>BB</name>
            <value>22</value>
        </bitem>
    </root>
    
    

    Expected output: only the specified objects, such as .aitem, .oitem and .sitem, must become arrays, not the entire JSON, and all numbers, floats and booleans must be non-strings, without double quotes.

    {
      "root": {
        "aitem": [
          {
            "name": "abc",
            "value": 123
          }
        ],
        "bitem": [
          {
            "name": "bbbb",
            "value": 2222
          },
          {
            "name": "BB",
            "value": true
          }
        ]
      }
    }
    
  • [FORMAT] Analog Captions & DTVCC Specs

    maybe open a new issue if you want to dump some specs and ideas.

    The wikipedia article about it seems quite good https://en.wikipedia.org/wiki/EIA-608 but would be nice to get hands on the spec.

    Line 21 Analog Captions (EIA-608)

    Spec for Analog Line 21 Captions is available, free to everyone, from https://shop.cta.tech/products/line-21-data-services. EIA-608 was originally designed for line-21 analog captions, so the spec covers both the data format and control codes and how to transmit them as analog. The transmission in line 21 analog is irrelevant in a digital world, but the control codes and decoding part is still referenced by its successor, DTVCC (EIA-708).

    However, there is also a great Web 1.0 page that covers the mysterious 608 control codes, data channels, fields and 7-bit encoding, and includes lookup tables for the two-byte words. The control codes are typically marked up in human-readable form as {ENM}, {EOC}, {EDC}, as these are the abbreviations used in the EIA-608 spec. The McPoodle tools are written in circa-2005 Perl. You will have seen references to SCC captions, which are basically just the file-based representation of 608 data. When caption tools disassemble 608/SCC data, the result is usually represented in these human-readable closed caption disassembly (CCD) codes. Each captioning and conversion tool has its own proprietary CCD file format, but the codes within, like {ENM}, {EOC}, {EDC}, are usually represented in the same markup. This reverse-engineered documentation is what everyone used before the EIA-608 spec became freely available.

    • http://www.theneitherworld.com/mcpoodle/SCC_TOOLS/DOCS/CC_CODES.HTML
    • http://www.theneitherworld.com/mcpoodle/SCC_TOOLS/DOCS/CC_CHARS.HTML
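
As a small illustration of the 7-bit part: each 608 byte carries seven data bits plus an odd-parity bit in the top position, so a decoder would typically check parity and mask it off before looking a word up in tables like the ones linked above. The function name stripParity is hypothetical, and the example bytes assume the commonly documented SCC spelling 942f for {EOC}; this is a sketch, not a 608 decoder.

```go
package main

import (
	"fmt"
	"math/bits"
)

// stripParity splits an EIA-608 byte into its 7-bit value and reports whether
// the byte has odd parity (an odd number of set bits across all 8 bits).
func stripParity(b byte) (value byte, ok bool) {
	return b & 0x7f, bits.OnesCount8(b)%2 == 1
}

func main() {
	// Assumed example: the two-byte word 0x94 0x2f as found in SCC files.
	for _, b := range []byte{0x94, 0x2f} {
		v, ok := stripParity(b)
		fmt.Printf("0x%02x -> 0x%02x parityOK=%v\n", b, v, ok)
	}
}
```

After masking, 0x94 0x2f becomes 0x14 0x2f, which is the form the control code lookup tables use.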

    Digital TV Closed Captions (DTVCC), aka EIA-708.

    Spec for DTVCC (EIA-708) is available, free to everyone, from https://shop.cta.tech/products/digital-television-dtv-closed-captioning. The DTVCC spec defines both the transmission format, ccdata(), and a new encoding format called 708 (Service 1/2). The same spec also cross-references EIA-608 for CC1/2/3/4 compatibility and explains how EIA-608 is encoded in ccdata(), which is exactly what we have been playing with.

    You'll have to register with CTA to "buy it for $0", but all the other standards shops charge for it. If you get it from the CTA (previously EIA/CEA), both are free. Historically, they were chargeable. Since it is copyrighted material, I have not attached these to your repo, not out of laziness, but out of respect.

    Feel free to close this ticket.

  • Link file formats to existing format registries

    @wader wrote:

    👍 I wonder if some formats in fq could have links to these projects? Currently format references are kept in <format-name>.md, e.g. https://github.com/wader/fq/blob/master/format/mp4/mp4.md#references

    I think the best registries of file formats to link to are

    Both can be edited freely and both link to other registries such as PRONOM. Wikidata further contains more structured information, such as default file extension and MIME type. I once wrote a Wikidata module for jq; this may help to get information from Wikidata (e.g. MP4).
