Slipstream is a method for lossless compression of power system data.

Slipstream

Design principles

  1. The protocol is designed for streaming raw measurement data, similar to the IEC 61850-9-2 Sampled Value protocol. It is designed to support high sample rate continuous point on wave (CPOW) voltage and current data, and also supports other measurement types.
  2. The data compression must be lossless.
  3. The protocol should be flexible. It should support any number of samples per message so that it can be used for different applications where efficient representation of measurement data is useful. For example, 2 to 8 samples per message may be appropriate for real-time applications while still benefitting from compression, and much larger messages could be used for archiving fault or event records, with excellent compression opportunities.
  4. Optimise for the lowest data size for minimal network data transfer and for small file sizes for saving event captures to permanent storage.
  5. Assume that out-of-band communications will agree the sampling rate and the number of variables to be transferred, similar to IEEE C37.118.2 configuration frames and STTP. This reduces the amount of data that must be sent in the main data stream to allow successful decoding.
  6. Ensure that data quality and time synchronisation information are strictly preserved and provided for every data sample.
  7. Prefer efficient encode and decode processing, with up-front allocations, where possible. However, the compression will naturally improve the end-to-end latency, due to the reduced data to be processed and transferred.
  8. The protocol should produce a byte stream which is suitable for a variety of transport methods, such as Ethernet, UDP, TCP, HTTP, WebSocket, and saving to a file.
  9. An error in one message should only invalidate that message, and not future messages.
  10. General-purpose compression algorithms have already been shown to be ineffective or computationally expensive for CPOW data [1].

Data types

  • 32-bit signed integer for all data values. This requires a scaled integer representation for floating-point data, but this approach has already been adopted for IEC 61850-9-2 encoding. However, the protocol could be extended for directly representing floating-point values in the future, using the method in [2].
  • 32-bit unsigned integer for quality. This is intended to be based on the IEC 61850 quality specification, for which only 14 bits are used (including the "derived" indicator), and only 16 bits should ever be used. It is proposed here that the most significant byte is used for time quality, with the two least significant bytes used for data quality according to the IEC 61850 approach. The exact use is not prescribed at present, but 32 bits per data sample have been provisioned.
  • 64-bit signed integer for timestamp. This is based on the Go language representation, using nanoseconds relative to 1st January 1970 UTC, which is limited to dates between the years 1678 and 2262. Timestamps in STTP are restricted to 100 ns resolution. While this is suitable for output values such as frequency, it is too coarse for CPOW data, which could be sampled at inconvenient rates such as 14.4 kHz: the 69444.44 ns sampling period would be truncated to 69400 ns, leading to an intrinsic 44.44 ns error. If the start of the data capture was always aligned to the second rollover point then the fraction of second value would always be zero, but the protocol should not be restricted in this way. Similarly, IEC 61850 and IEEE C37.118.2 timestamps dedicate only 24 bits to the fraction of second, which limits the resolution to about 59.6 ns.
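The resolution figures quoted for the timestamp type can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the library; the function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// samplePeriodNs returns the exact sampling period in nanoseconds.
func samplePeriodNs(rateHz float64) float64 {
	return 1e9 / rateHz
}

// truncateTo rounds ns down to a multiple of resolutionNs.
func truncateTo(ns, resolutionNs float64) float64 {
	return math.Floor(ns/resolutionNs) * resolutionNs
}

// fractionResolutionNs returns the smallest time step, in nanoseconds,
// that a fraction-of-second field with the given number of bits can express.
func fractionResolutionNs(bits uint) float64 {
	return 1e9 / float64(uint64(1)<<bits)
}

func main() {
	period := samplePeriodNs(14400)   // 14.4 kHz sampling
	stored := truncateTo(period, 100) // 100 ns timestamp resolution

	fmt.Printf("exact period:       %.2f ns\n", period)
	fmt.Printf("at 100 ns steps:    %.2f ns\n", stored)
	fmt.Printf("truncation error:   %.2f ns\n", period-stored)
	fmt.Printf("24-bit fraction:    %.2f ns per step\n", fractionResolutionNs(24))
}
```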

Protocol details

With knowledge of the underlying data, the compression method can be tailored for much greater performance than generic compression such as gzip [3]. Each quantity is compressed and encoded separately. This is the same method used in [1], and is similar to how TimescaleDB compresses columns of data over time, rather than generically compressing each row as it is added to the database.

It is assumed that every sample is included for the defined message size. If a sample was missed (e.g. due to the sensor or underlying data source being unavailable), a zero sample should be added and the data quality should be adjusted appropriately. This simplifies the encoding and significantly reduces the amount of data to be sent because only the starting timestamp needs to be included per message, and all other timestamps can be inferred. Therefore, a single 64-bit field can encode the timestamp, rather than 64 bits per sample.
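As a rough illustration of how a decoder might infer per-sample timestamps from the single start timestamp, the sketch below assumes the sampling rate is known from the out-of-band configuration. The function name and the integer-division rounding are assumptions, not the library's actual behaviour.

```go
package main

import "fmt"

// sampleTimestamp infers the timestamp (nanoseconds since the Unix epoch)
// of sample i from the message start timestamp and the sampling rate.
// The offset is computed from the sample index each time, rather than by
// repeatedly adding a rounded period, so rounding error does not accumulate.
func sampleTimestamp(startNs int64, i int, samplingRateHz int64) int64 {
	return startNs + int64(i)*1_000_000_000/samplingRateHz
}

func main() {
	const start = int64(1_600_000_000_000_000_000) // hypothetical start time
	for i := 0; i < 4; i++ {
		fmt.Println(i, sampleTimestamp(start, i, 14400))
	}
	// After exactly one second of samples the offset is exactly 1e9 ns.
	fmt.Println(sampleTimestamp(start, 14400, 14400) - start)
}
```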

Wherever possible, variable length encoding is used (with zig-zag encoding for signed values, the same as Google Protocol Buffers).
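A minimal sketch of zig-zag plus variable-length encoding for 32-bit values is shown below, using the unsigned varint functions from Go's standard `encoding/binary` package. The helper names are hypothetical and the library's own encoder may be organised differently.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// zigzag maps signed values to unsigned ones so that small magnitudes,
// positive or negative, become small numbers: 0->0, -1->1, 1->2, -2->3.
func zigzag(v int32) uint32 {
	return uint32(v<<1) ^ uint32(v>>31)
}

// unzigzag reverses zigzag.
func unzigzag(u uint32) int32 {
	return int32(u>>1) ^ -int32(u&1)
}

// appendVarint32 appends v to buf as a zig-zag encoded base-128 varint.
func appendVarint32(buf []byte, v int32) []byte {
	var tmp [binary.MaxVarintLen32]byte
	n := binary.PutUvarint(tmp[:], uint64(zigzag(v)))
	return append(buf, tmp[:n]...)
}

// readVarint32 decodes one value and returns it with the bytes consumed.
// A byte count of zero or less indicates a truncated or invalid input.
func readVarint32(buf []byte) (int32, int) {
	u, n := binary.Uvarint(buf)
	return unzigzag(uint32(u)), n
}

func main() {
	for _, v := range []int32{0, -1, 1, 63, 64, -64, -65, 100000} {
		enc := appendVarint32(nil, v)
		dec, _ := readVarint32(enc)
		fmt.Printf("%7d -> %d byte(s) -> %d\n", v, len(enc), dec)
	}
}
```

Values in the range -64 to 63 fit in a single byte, which is why small sample-to-sample differences compress so well.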

The first sample must be encoded in full. The second sample is encoded as the difference from the first sample (delta encoding). All remaining samples are encoded using delta-delta encoding, and the number of "layers" of the delta-delta encoding can be configured. If a relatively large number of values is included per message (such as for an event record), simple-8b encoding can be used to improve the packing of the variable-length integer values. It is slightly better to use simple-8b for all values, even the first and second values.
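The delta and delta-delta scheme can be sketched as repeated differencing. The code below assumes that the number of "layers" means the number of differencing passes, so two layers gives the first sample in full, the second as a delta, and the rest as delta-delta values. That interpretation, and the function names, are assumptions for illustration.

```go
package main

import "fmt"

// deltaEncode applies the given number of differencing passes in place.
// Pass l leaves the first l+1 values untouched, so with layers = 2 the
// result is: v[0] in full, v[1] as a delta, and v[2:] as delta-delta.
// int32 arithmetic wraps on overflow, so decoding still restores the input.
func deltaEncode(v []int32, layers int) {
	for l := 0; l < layers; l++ {
		for i := len(v) - 1; i > l; i-- {
			v[i] -= v[i-1]
		}
	}
}

// deltaDecode reverses deltaEncode by summing in the opposite pass order.
func deltaDecode(v []int32, layers int) {
	for l := layers - 1; l >= 0; l-- {
		for i := l + 1; i < len(v); i++ {
			v[i] += v[i-1]
		}
	}
}

func main() {
	samples := []int32{100, 110, 121, 133, 146}
	deltaEncode(samples, 2)
	fmt.Println("encoded:", samples)
	deltaDecode(samples, 2)
	fmt.Println("decoded:", samples)
}
```

For a smoothly varying waveform the delta-delta values stay close to zero, which keeps the variable-length encoding of each value short.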

The quality is assumed to not change very often. Therefore, it is encoded using run-length encoding (RLE). A special run-length of 0 is used to represent that all future values within the same message are the same. So, for the common case where the quality value is 0 for all samples, that can be encoded in one byte for the value plus one byte for the number of samples.
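The run-length scheme for quality values could look something like the sketch below. It assumes that each run is stored as a (value, run length) pair of unsigned varints and that the final run in a message is always written with a run length of 0. The exact byte layout used by the library is not described here, so treat the details as assumptions.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeQualityRLE writes quality words as (value, run length) pairs.
// The last run uses a run length of 0, meaning "until the end of the message".
func encodeQualityRLE(q []uint32) []byte {
	var out []byte
	var tmp [binary.MaxVarintLen32]byte
	put := func(x uint32) {
		n := binary.PutUvarint(tmp[:], uint64(x))
		out = append(out, tmp[:n]...)
	}
	for i := 0; i < len(q); {
		j := i
		for j < len(q) && q[j] == q[i] {
			j++
		}
		put(q[i])
		if j == len(q) {
			put(0) // final run covers all remaining samples
		} else {
			put(uint32(j - i))
		}
		i = j
	}
	return out
}

// decodeQualityRLE expands the runs. n is the number of samples in the
// message, which the decoder already knows from the message header.
func decodeQualityRLE(buf []byte, n int) ([]uint32, error) {
	q := make([]uint32, 0, n)
	for len(q) < n {
		val, used := binary.Uvarint(buf)
		if used <= 0 {
			return nil, errors.New("truncated quality value")
		}
		buf = buf[used:]
		run, used := binary.Uvarint(buf)
		if used <= 0 {
			return nil, errors.New("truncated run length")
		}
		buf = buf[used:]
		count := n - len(q) // run length 0, or an oversized run, fills the rest
		if run != 0 && run < uint64(count) {
			count = int(run)
		}
		for k := 0; k < count; k++ {
			q = append(q, uint32(val))
		}
	}
	return q, nil
}

func main() {
	allGood := make([]uint32, 6)
	fmt.Println(encodeQualityRLE(allGood)) // two bytes: value 0, run length 0

	mixed := []uint32{0, 0, 0x41, 0x41, 0}
	enc := encodeQualityRLE(mixed)
	dec, err := decodeQualityRLE(enc, len(mixed))
	fmt.Println(enc, dec, err)
}
```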

There are four sections of each message using the protocol:

  1. Header
  2. First sample data encoding
  3. Second and later sample encoding
  4. Quality values for each sample

The protocol header contains the following fields:

  1. UUID, 16 bytes
  2. Timestamp of the first sample, 8 bytes
  3. Number of encoded samples, variable length
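The header layout can be sketched as follows. The byte order of the timestamp and the use of an unsigned varint for the sample count are assumptions made for this illustration, as are the type and function names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// header mirrors the three header fields listed above.
type header struct {
	ID         [16]byte // UUID identifying the stream
	StartNs    int64    // timestamp of the first sample, ns since Unix epoch
	NumSamples uint32   // number of encoded samples in this message
}

// appendTo serialises the header. Big-endian byte order for the timestamp
// and an unsigned varint for the sample count are assumptions.
func (h header) appendTo(buf []byte) []byte {
	buf = append(buf, h.ID[:]...)
	var ts [8]byte
	binary.BigEndian.PutUint64(ts[:], uint64(h.StartNs))
	buf = append(buf, ts[:]...)
	var tmp [binary.MaxVarintLen32]byte
	n := binary.PutUvarint(tmp[:], uint64(h.NumSamples))
	return append(buf, tmp[:n]...)
}

// parseHeader decodes a header and returns the number of bytes consumed.
func parseHeader(buf []byte) (header, int, error) {
	var h header
	if len(buf) < 24 {
		return h, 0, errors.New("header too short")
	}
	copy(h.ID[:], buf[:16])
	h.StartNs = int64(binary.BigEndian.Uint64(buf[16:24]))
	n, used := binary.Uvarint(buf[24:])
	if used <= 0 || n > 0xFFFFFFFF {
		return h, 0, errors.New("invalid sample count")
	}
	h.NumSamples = uint32(n)
	return h, 24 + used, nil
}

func main() {
	h := header{StartNs: 1_600_000_000_000_000_000, NumSamples: 6}
	copy(h.ID[:], "hypothetical-id!") // placeholder 16-byte identifier
	buf := h.appendTo(nil)
	fmt.Println("header bytes:", len(buf))

	got, used, err := parseHeader(buf)
	fmt.Println(got.NumSamples, got.StartNs, used, err)
}
```

Under these assumptions the header is 25 bytes for messages of up to 127 samples, since the sample count then fits in a single varint byte.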

The next thing to encode is the first sample of each variable. Then, each sample is encoded using delta or delta-delta encoding. After all samples are encoded, the quality RLE section is encoded.

Compression performance

Compression typically reduces data to about 15% of the theoretical uncompressed sample size (assuming 4 bytes for data, 4 bytes for quality, and 8 bytes for timestamp). Higher sampling rates can compress down to less than 5%, often requiring less than 1 byte per new sample on average. Shorter messages with fewer samples will achieve compression of 15-25%. Compared to IEC 61850-9-2 SV, the performance is even better due to the additional overhead and repeated data inherent in the SV ASDU structure. For example, sampling at 14.4 kHz with 6 samples (ASDUs) per message (using the "LE" dataset with 8 variables) requires about 589 bytes for SV (including the Ethernet header, but not including the "RefrTm" timestamp). This new protocol requires only about 134 bytes to convey the same information.
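To put the SV comparison into bandwidth terms, the short calculation below converts the two message sizes quoted above into bit rates. It assumes the quoted sizes are the complete per-message cost and ignores any additional transport overhead for the new protocol; the function name is hypothetical.

```go
package main

import "fmt"

// bitsPerSecond returns the bit rate of a stream of fixed-size messages.
func bitsPerSecond(bytesPerMessage, samplingRateHz, samplesPerMessage int) int {
	messagesPerSecond := samplingRateHz / samplesPerMessage
	return bytesPerMessage * messagesPerSecond * 8
}

func main() {
	const rate, perMsg = 14400, 6 // 14.4 kHz, 6 samples per message

	sv := bitsPerSecond(589, rate, perMsg)
	slip := bitsPerSecond(134, rate, perMsg)

	fmt.Printf("SV:         %.2f Mbit/s\n", float64(sv)/1e6)
	fmt.Printf("Slipstream: %.2f Mbit/s\n", float64(slip)/1e6)
	fmt.Printf("size ratio: %.1f%%\n", 100*float64(slip)/float64(sv))
}
```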

The protocol compression tends to perform better for higher sampling rates, because the difference between consecutive samples is smaller and, on average, fewer bytes are required. Similarly, the protocol compression tends to perform worse for larger RMS values of voltage and current because the differences between samples are greater.

A disadvantage of the protocol is that changes in data values or quality values will increase the message size. This means that more data must be sent or recorded when important or interesting events occur, compared with the steady state.

Random noise in the encoded quantities will reduce compression performance. Harmonics will also have this effect, but to a lesser extent.

Other

Decoders must have knowledge of the encoding parameters. This means that Wireshark may be unable to provide diagnostic information, unless it is also able to access and decode the out-of-band data which describes the protocol instance (i.e. the sampling rate and number of variables).

It is not possible to decode the quality values until all the data values in a message are decoded first.

It is assumed that three-phase quantities should also include a neutral component, similar to the IEC 61850 "LE" profile.

The IEC 61850-9-2 SV protocol allows some flexibility. For example, in principle, ASDUs in the same message could be mixed from different datasets. This new protocol does not allow this. However, it would be simpler to modify the SV dataset to encompass the additional data, rather than having multiple datasets, or to send separate SV streams. It would also be easier to encode and decode, compared to mixing ASDUs from different datasets.

Internally, the encoder uses an alternating ping-pong buffer. This means that it is acceptable to read the output of the encoder while a new message starts being encoded. However, the output from the first message must be fully saved or copied before a third message is started. The encoder is not thread-safe, so a single instance should only be used from the same thread. This is to ensure that the order of calls to Encode() is preserved. While mutex locking will synchronise access, it does not queue subsequent calls to Encode().
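The consequence of the ping-pong arrangement can be illustrated with a toy model. The `pingPong` type below is hypothetical and is not the library's encoder; it only shows why output from one message remains valid during the next message but is overwritten by the one after that.

```go
package main

import "fmt"

// pingPong holds two pre-allocated buffers and alternates between them.
type pingPong struct {
	bufs [2][]byte
	next int
}

func newPingPong(capacity int) *pingPong {
	p := &pingPong{}
	p.bufs[0] = make([]byte, 0, capacity)
	p.bufs[1] = make([]byte, 0, capacity)
	return p
}

// encode writes msg into the next buffer and returns a slice that aliases
// that buffer. The returned slice is not a copy.
func (p *pingPong) encode(msg []byte) []byte {
	i := p.next
	p.next = 1 - i
	p.bufs[i] = append(p.bufs[i][:0], msg...)
	return p.bufs[i]
}

func main() {
	p := newPingPong(16)

	first := p.encode([]byte("A"))
	p.encode([]byte("B"))
	fmt.Printf("after 2nd message, first = %q\n", first) // still intact

	saved := append([]byte(nil), first...) // copy before the 3rd message
	p.encode([]byte("C"))
	fmt.Printf("after 3rd message, first = %q, saved = %q\n", first, saved)
}
```

Copying the output, or finishing with it, before the third call is what keeps the data safe. Calling from a single goroutine keeps the call order well defined.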

Tests

You can run the test suite locally with:

go test -v

References

Footnotes

  1. Blair, S. M., Roscoe, A. J., & Irvine, J. (2016). Real-time compression of IEC 61869-9 sampled value data. 2016 IEEE International Workshop on Applied Measurements for Power Systems (AMPS), 1–6. https://doi.org/10.1109/AMPS.2016.7602854 https://strathprints.strath.ac.uk/57710/1/Blair_etal_AMPS2016_Real_time_compression_of_IEC_61869_9_sampled_value_data.pdf

  2. Andersen, M. P., & Culler, D. (2016). BTrDB: Optimizing Storage System Design for Timeseries Processing. 14th USENIX Conference on File and Storage Technologies (FAST ’16). https://www.usenix.org/conference/fast16/technical-sessions/presentation/andersen

  3. https://arxiv.org/pdf/1209.2137.pdf

Comments
  • Questions / Usage

    Hi Steven,

    Cool stuff here, enjoyed your NASPI presentation.

    Firstly, as the STTP standard is not finalized yet, it would be great to address your timestamp concern before the standard is released:

    Timestamps in STTP are restricted to 100 ns resolution, while suitable for output values such as synchrophasors and frequency, it is very inaccurate for CPOW data, which could be sampled at inconvenient rates such as 14.4 kHz (so the 69444 ns sampling period would be truncated to 69400.00 ns, leading to an intrinsic 44.44 ns error). If the start of the data capture was always aligned to the second roll over point then the fraction of second value would always be zero, but the protocol should not be restricted in this way

    Looks like something that should be easy enough to accommodate via a setting. Would you recommend nanosecond resolution, higher?

    Secondly, since STTP can support other compression algorithms, would it be OK if we deployed an STTP implementation that included your protocol as a compression option? The license matches STTP implementations; perhaps this API target (since it is also in Go): https://github.com/sttp/goapi - could even run a comparison to the built in compression algorithm, which is fairly simple?
