Encode and decode binary message and file formats in Go

Encode and Decode Binary Formats in Go

This module wraps the package encoding/binary of the Go standard library and provides the missing Marshal() and Unmarshal() functions a la encoding/json and encoding/xml.

Binary Message and File Formats

Messsage and file formats speecify how bits are arranged to encode information. Control over individual bits or groups of smaller than a byte is often required to put together and take apart these binary structures.

Message Formats

Example taken from RFC 791 defining the Internet Protocol:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  IHL  |Type of Service|          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live |    Protocol   |         Header Checksum       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Source Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Destination Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Example Internet Datagram Header

                               Figure 4.

  Note that each tick mark represents one bit position.

  ...

  Type of Service:  8 bits

    ...

      Bits 0-2:  Precedence.
      Bit    3:  0 = Normal Delay,      1 = Low Delay.
      Bits   4:  0 = Normal Throughput, 1 = High Throughput.
      Bits   5:  0 = Normal Relibility, 1 = High Relibility.
      Bit  6-7:  Reserved for Future Use.

         0     1     2     3     4     5     6     7
      +-----+-----+-----+-----+-----+-----+-----+-----+
      |                 |     |     |     |     |     |
      |   PRECEDENCE    |  D  |  T  |  R  |  0  |  0  |
      |                 |     |     |     |     |     |
      +-----+-----+-----+-----+-----+-----+-----+-----+

        Precedence

          111 - Network Control
          110 - Internetwork Control
          101 - CRITIC/ECP
          100 - Flash Override
          011 - Flash
          010 - Immediate
          001 - Priority
          000 - Routine

Note how the Internet Datagram Header is visualised as a series of six 32-bit "words".

File Formats

Example taken from RFC 1952 defining the GZIP file format:

) +---+---+---+---+---+---+---+---+---+---+ ... 2.3.1. Member header and trailer ... FLG (FLaGs) This flag byte is divided into individual bits as follows: bit 0 FTEXT bit 1 FHCRC bit 2 FEXTRA bit 3 FNAME bit 4 FCOMMENT bit 5 reserved bit 6 reserved bit 7 reserved ">
   1.2. Intended audience

      ...

      The text of the specification assumes a basic background in
      programming at the level of bits and other primitive data
      representations.

   2.1. Overall conventions

      In the diagrams below, a box like this:

         +---+
         |   | <-- the vertical bars might be missing
         +---+

      represents one byte; ...

   2.2. File format

      A gzip file consists of a series of "members" (compressed data
      sets).  The format of each member is specified in the following
      section.  The members simply appear one after another in the file,
      with no additional information before, between, or after them.

   2.3. Member format

      Each member has the following structure:

         +---+---+---+---+---+---+---+---+---+---+
         |ID1|ID2|CM |FLG|     MTIME     |XFL|OS | (more-->)
         +---+---+---+---+---+---+---+---+---+---+

      ...

      2.3.1. Member header and trailer

         ...

         FLG (FLaGs)
            This flag byte is divided into individual bits as follows:

               bit 0   FTEXT
               bit 1   FHCRC
               bit 2   FEXTRA
               bit 3   FNAME
               bit 4   FCOMMENT
               bit 5   reserved
               bit 6   reserved
               bit 7   reserved

Working with Binary Formats in Go

The smallest data structure Go provides is the basic type byte (alias of int8). Bitwise logical and shift operators are however available.

Relevant Questions Posted on StackOverflow

Suggestions on StackOverflow in response to the above-described need are limited to the use of bitwise operators.

An Elegant Solution

Encoding and decoding binary formats in Go should be a matter of attaching tags to struct fields and calling Marshal()/Unmarshal().

package main

import (
	"log"

	"github.com/joel-ling/go-bitfields/pkg/encoding/binary"
)

type rfc791InternetHeaderWord0 struct {
	// Reference: RFC 791 Internet Protocol, Section 3.1 Internet Header Format
	// https://datatracker.ietf.org/doc/html/rfc791#section-3.1

	Version                  uint                           `bitfield:"4,28"`
	InternetHeaderLength     uint                           `bitfield:"4,24"`
	TypeOfServicePrecedence  rfc791InternetHeaderPrecedence `bitfield:"3,21"`
	TypeOfServiceDelay       bool                           `bitfield:"1,20"`
	TypeOfServiceThroughput  bool                           `bitfield:"1,19"`
	TypeOfServiceReliability bool                           `bitfield:"1,18"`
	TypeOfServiceReserved    uint                           `bitfield:"2,16"`
	TotalLength              uint                           `bitfield:"16,0"`
}

type rfc791InternetHeaderPrecedence byte

const (
	rfc791InternetHeaderPrecedenceRoutine rfc791InternetHeaderPrecedence = iota
	rfc791InternetHeaderPrecedencePriority
	rfc791InternetHeaderPrecedenceImmediate
	rfc791InternetHeaderPrecedenceFlash
	rfc791InternetHeaderPrecedenceFlashOverride
	rfc791InternetHeaderPrecedenceCRITICOrECP
	rfc791InternetHeaderPrecedenceInternetworkControl
	rfc791InternetHeaderPrecedenceNetworkControl
)

func main() {
	const (
		internetHeaderLength  = 5
		internetHeaderVersion = 4

		internetHeaderDelayNormal = false
		internetHeaderDelayLow    = true

		internetHeaderThroughputNormal = false
		internetHeaderThroughputHigh   = true

		internetHeaderReliabilityNormal = false
		internetHeaderReliabilityHigh   = true

		internetHeaderTotalLength = 30222

		reserved = 0
	)

	const (
		formatBytes  = "%08b"
		formatStruct = "%#v"
	)

	var (
		structure0 = rfc791InternetHeaderWord0{
			Version:                  internetHeaderVersion,
			InternetHeaderLength:     internetHeaderLength,
			TypeOfServicePrecedence:  rfc791InternetHeaderPrecedenceImmediate,
			TypeOfServiceDelay:       internetHeaderDelayLow,
			TypeOfServiceThroughput:  internetHeaderThroughputNormal,
			TypeOfServiceReliability: internetHeaderReliabilityHigh,
			TypeOfServiceReserved:    reserved,
			TotalLength:              internetHeaderTotalLength,
		}

		structure1 = rfc791InternetHeaderWord0{}
	)

	var (
		bytes []byte
		e     error
	)

	bytes, e = binary.Marshal(&structure0)
	if e != nil {
		log.Fatalln(e)
	}

	log.Printf(formatBytes, bytes)
	// [01000101 01010100 01110110 00001110]

	e = binary.Unmarshal(bytes, &structure1)
	if e != nil {
		log.Fatalln(e)
	}

	log.Printf(formatStruct, structure1 == structure0)
	// true
}

Large structures can be encoded and decoded in 32-bit instalments.

Struct Tags

The encoding of every struct field is determined by a struct tag of the following key and format:

bitfield:" , "

Length

refers to an integer representing the bit-length of a field, i.e. the number of bits in the field.

Offset

refers to an integer representing the number of places the bit field should be shifted left from the rightmost section of a 32-bit sequence for its position in that sequence to be appropriate. It is also the number of places to the right of the rightmost bit of the field.

Notice how in a fully utilised 32-bit word, the offset of every field is the cumulative sum of their lengths, when viewed from the rightmost field to the leftmost.

Similar Resources

Some Golang types based on builtin. Implements interfaces Value / Scan and MarshalJSON / UnmarshalJSON for simple working with database NULL-values and Base64 encoding / decoding.

gotypes Some simple types based on builtin Golang types that implement interfaces for working with DB (Scan / Value) and JSON (Marshal / Unmarshal). N

Feb 12, 2022

Asn.1 BER and DER encoding library for golang.

WARNING This repo has been archived! NO further developement will be made in the foreseen future. asn1 -- import "github.com/PromonLogicalis/asn1" Pac

Nov 14, 2022

idiomatic codec and rpc lib for msgpack, cbor, json, etc. msgpack.org[Go]

go-codec This repository contains the go-codec library, the codecgen tool and benchmarks for comparing against other libraries. This is a High Perform

Dec 19, 2022

Go library for decoding generic map values into native Go structures and vice versa.

mapstructure mapstructure is a Go library for decoding generic map values to structures and vice versa, while providing helpful error handling. This l

Jan 1, 2023

Easily and dynamically generate maps from Go static structures

structomap This package helps you to transform your struct into map easily. It provides a structomap.Serializer interface implemented by the structoma

Dec 9, 2022

Go package for dealing with maps, slices, JSON and other data.

Objx Objx - Go package for dealing with maps, slices, JSON and other data. Get started: Install Objx with one line of code, or update it with another

Dec 27, 2022

An optimal, byte-aligned, LZ+RLE hybrid encoder, designed to maximize decoding speed on NMOS 6502 and derived CPUs

An optimal, byte-aligned, LZ+RLE hybrid encoder, designed to maximize decoding speed on NMOS 6502 and derived CPUs

TSCrunch TSCrunch is an optimal, byte-aligned, LZ+RLE hybrid encoder, designed to maximize decoding speed on NMOS 6502 and derived CPUs, while keeping

Dec 21, 2022

Encode and Decode Message Length Indicators for TCP/IP socket based protocols

SimpleMLI A Message Length Indicator Encoder/Decoder Message Length Indicators (MLI) are commonly used in communications over raw TCP/IP sockets. This

Nov 24, 2022

Prometheus Common Data Exporter can parse JSON, XML, yaml or other format data from various sources (such as HTTP response message, local file, TCP response message and UDP response message) into Prometheus metric data.

Prometheus Common Data Exporter can parse JSON, XML, yaml or other format data from various sources (such as HTTP response message, local file, TCP response message and UDP response message) into Prometheus metric data.

Prometheus Common Data Exporter Prometheus Common Data Exporter 用于将多种来源(如http响应报文、本地文件、TCP响应报文、UDP响应报文)的Json、xml、yaml或其它格式的数据,解析为Prometheus metric数据。

May 18, 2022

Decode / encode XML to/from map[string]interface{} (or JSON); extract values with dot-notation paths and wildcards. Replaces x2j and j2x packages.

mxj - to/from maps, XML and JSON Decode/encode XML to/from map[string]interface{} (or JSON) values, and extract/modify values from maps by key or key-

Dec 22, 2022

Decode / encode XML to/from map[string]interface{} (or JSON); extract values with dot-notation paths and wildcards. Replaces x2j and j2x packages.

mxj - to/from maps, XML and JSON Decode/encode XML to/from map[string]interface{} (or JSON) values, and extract/modify values from maps by key or key-

Dec 29, 2022

Generate, encode, and decode UUIDs v1 with fast or cryptographic-quality random node identifier.

A Go package for generating and manipulating UUIDs Generate, encode, and decode UUIDs v1, as defined in RFC 4122, in Go. Project Status v1.1.0 Stable:

Sep 27, 2022

Encode and decode Go (golang) struct types via protocol buffers.

protostructure protostructure is a Go library for encoding and decoding a struct type over the wire. This library is useful when you want to send arbi

Nov 15, 2022

a little app to gzip+base64 encode and decode

GO=GZIP64 A little golang console utility that reads a file and either: 1) Encodes it - gzip compress followed by base64 encode writes

Oct 16, 2021

A Blurhash implementation in pure Go (Decode/Encode)

A Blurhash implementation in pure Go (Decode/Encode)

go-blurhash go-blurhash is a pure Go implementation of the BlurHash algorithm, which is used by Mastodon an other Fediverse software to implement a sw

Dec 27, 2022

:triangular_ruler:gofmtmd formats go source code block in Markdown. detects fenced code & formats code using gofmt.

:triangular_ruler:gofmtmd formats go source code block in Markdown. detects fenced code & formats code using gofmt.

gofmtmd gofmtmd formats go source code block in Markdown. detects fenced code & formats code using gofmt. Installation $ go get github.com/po3rin/gofm

Oct 31, 2022

Formats discord tokens to different formats.

Formats discord tokens to different formats.

token_formatter Formats discord tokens to different formats. Features Format your current tokens to a new format! Every tool uses a different format f

Nov 3, 2022

Sqlyog-password-decoder - Simple decode passwords from .sycs file (SQLyog export connections file)

Decode password: ./sqlyog-password-decoder -str password -action decode Encode p

Nov 21, 2021

Aces is a command line utility that lets you encode any file to a character set of your choice.

Aces Any Character Encoding Set Aces is a command line utility that lets you encode any file to a character set of your choice. For example, you could

Nov 28, 2022
Sqlyog-password-decoder - Simple decode passwords from .sycs file (SQLyog export connections file)

Decode password: ./sqlyog-password-decoder -str password -action decode Encode p

Nov 21, 2021
A standard way to wrap a proto message

Welcome to Pletter ?? A standard way to wrap a proto message Pletter was born with a single mission: To standardize wrapping protocol buffer messages.

Nov 17, 2022
A simple Slack message tool for the CLI written in Go

heka A simple Slack message tool for the CLI written in Go Report Bug · Request

Jan 16, 2022
Simple, specialised, and efficient binary marshaling

?? surge Documentation A library for fast binary (un)marshaling. Designed to be used in Byzantine networks, ?? surge never explicitly panics, protects

Oct 4, 2022
Golang binary decoder for mapping data into the structure

binstruct Golang binary decoder to structure Install go get -u github.com/ghostiam/binstruct Examples ZIP decoder PNG decoder Use For struct From file

Dec 17, 2022
binary serialization format

Colfer Colfer is a binary serialization format optimized for speed and size. The project's compiler colf(1) generates source code from schema definiti

Dec 25, 2022
Musgo is a Go code generator for binary MUS format with validation support.

Musgo is a Go code generator for binary MUS format with validation support. Generated code converts data to and from MUS format.

Dec 29, 2022
Fixed width file parser (encoder/decoder) in GO (golang)

Fixed width file parser (encoder/decoder) for GO (golang) This library is using to parse fixed-width table data like: Name Address

Sep 27, 2022
Cap'n Proto library and parser for go. This is go-capnproto-1.0, and does not have rpc. See https://github.com/zombiezen/go-capnproto2 for 2.0 which has rpc and capabilities.

Version 1.0 vs 2.0 Update 2015 Sept 20: Big news! Version 2.0 of the go-bindings, authored by Ross Light, is now released and newly available! It feat

Nov 29, 2022
csvutil provides fast and idiomatic mapping between CSV and Go (golang) values.
csvutil provides fast and idiomatic mapping between CSV and Go (golang) values.

csvutil Package csvutil provides fast and idiomatic mapping between CSV and Go (golang) values. This package does not provide a CSV parser itself, it

Jan 6, 2023