FASTQ demultiplexer for single-cell data from MGI sequencers (10x converted library).

fastq_demultiplexer

Converts single-cell FASTQ data from MGI sequencers (10x converted library) to an Illumina-compatible format.


Installation

go install github.com/overerd/fastq_demultiplexer@latest

Arguments

Required:

  • -1|--r1 and -2|--r2 paths to the R1 and R2 files of a paired fastq set.
  • -c|--csv-file csv table with the barcodes for each index.
  • -o|--output-directory output directory (will contain one subdirectory per demultiplexed sample).

Optional:

  • -s|--csv-separator barcode csv-file separator (default: ',').
  • --targets-file path to a file with targets (if omitted, all indexes found in the barcodes file are selected).
  • --transform-strategy strategy used to transform the fastq data (supported strategies: 10x, 10x_no_index) (default: 10x).
  • --filename-template filename template (default: '{{.SampleName}}_S{{.SampleNumber}}_L00{{.LaneNumber}}_{{.ReadType}}_001.fastq.gz').
  • --lane-number lane number for selected fastq pair (default: 1).
  • --buffer-size sets the buffer size in bytes for reading fastq files; increase it if reading fastq files with long lines fails with a "token too long" error (default: 10 * 1024 * 1024).
  • --block-size sets the buffer size for accumulating multiple paired reads in the transformation pipeline (default: 4 * 2 * 1024).
  • --compression-level gzip compression level for the output files, where applicable, in the range [1, 9] (default: 1).
  • --debug enables debug messages.

Filename template:

The filename template uses Go template syntax: {{.Variable}}.

For example, the template {{.SampleName}}_S{{.SampleNumber}}_L00{{.LaneNumber}}_{{.ReadType}}_001.fastq.gz produces filenames like H2_S1_L001_R2_001.fastq.gz.

Supported template variables:
  • {{.SampleName}} - index name from the barcodes csv-file
  • {{.SampleNumber}} - index number
  • {{.LaneNumber}} - value of --lane-number
  • {{.ReadType}} - read type (one of R1, R2, or I1)
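
How such a template expands can be reproduced with Go's standard text/template package. This is a minimal sketch: the fileNameData struct and the renderFilename function are hypothetical and only mirror the variable names listed above; the tool's real types may differ.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// fileNameData is a hypothetical struct that mirrors the documented
// template variables; it is not taken from the tool's source.
type fileNameData struct {
	SampleName   string
	SampleNumber int
	LaneNumber   int
	ReadType     string
}

// renderFilename parses pattern as a Go template and fills it from data.
func renderFilename(pattern string, data fileNameData) (string, error) {
	tmpl, err := template.New("filename").Parse(pattern)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	pattern := "{{.SampleName}}_S{{.SampleNumber}}_L00{{.LaneNumber}}_{{.ReadType}}_001.fastq.gz"
	name, err := renderFilename(pattern, fileNameData{
		SampleName:   "H2",
		SampleNumber: 1,
		LaneNumber:   1,
		ReadType:     "R2",
	})
	if err != nil {
		fmt.Println("template error:", err)
		return
	}
	fmt.Println(name) // H2_S1_L001_R2_001.fastq.gz
}
```

Note that the literal L00 prefix is part of the template, so a lane number of 10 or higher would render as L0010 in this sketch.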

Example

fastq_demultiplexer \
    -1 v350013347_run65_L01_read_1.fq.gz \
    -2 v350013347_run65_L01_read_2.fq.gz \
    -c Single_Index_Kit_T_Set_A.csv \
    --lane-number 1 \
    --block-size 1000 \
    --targets-file targets.txt \
    --transform-strategy 10x_no_index \
    --filename-template '{{.SampleName}}_S{{.SampleNumber}}_L00{{.LaneNumber}}_{{.ReadType}}_001.fastq.gz' \
    -o output/ \
    --debug