Sequence-based Go-native audio mixer for music apps

Mix

https://github.com/go-mix/mix

See demo/demo.go:

package main

import (
  "fmt"
  "os"
  "time"
  
  "github.com/go-mix/mix"
  "github.com/go-mix/mix/bind"
)

var (
  sampleHz   = float64(48000)
  spec       = bind.AudioSpec{
    Freq:     sampleHz,
    Format:   bind.AudioF32,
    Channels: 2,
  }
  bpm        = 120
  step       = time.Minute / time.Duration(bpm*4)
  loops      = 16
  prefix     = "sound/808/"
  kick1      = "kick1.wav"
  kick2      = "kick2.wav"
  marac      = "maracas.wav"
  snare      = "snare.wav"
  hitom      = "hightom.wav"
  clhat      = "cl_hihat.wav"
  pattern    = []string{
    kick2,
    marac,
    clhat,
    marac,
    snare,
    marac,
    clhat,
    kick2,
    marac,
    marac,
    hitom,
    marac,
    snare,
    kick1,
    clhat,
    marac,
  }
)

func main() {
  defer mix.Teardown()    
  
  mix.Debug(true)
  mix.Configure(spec)
  mix.SetSoundsPath(prefix)
  mix.StartAt(time.Now().Add(1 * time.Second))

  t := 2 * time.Second // padding before music
  for n := 0; n < loops; n++ {
    for s := 0; s < len(pattern); s++ {
      mix.SetFire(pattern[s], t+time.Duration(s)*step, 0, 1.0, 0)
    }
    t += time.Duration(len(pattern)) * step
  }

  fmt.Printf("Mix, pid:%v, spec:%v\n", os.Getpid(), spec)
  for mix.FireCount() > 0 {
    time.Sleep(1 * time.Second)
  }
}

Play this Demo from the root of the project, with no actual audio playback:

make demo

Or export WAV via stdout > demo/output.wav:

make demo.wav

Credit

Charney Kaye

XJ Music Inc.

What?

Game audio mixers are designed to play audio spontaneously, but when the timing is known in advance (e.g. sequence-based music apps) there is a demand for much greater accuracy in playback timing.

Read the API documentation at godoc.org/github.com/go-mix/mix

Mix seeks to solve the problem of audio mixing for the playback of sequences where the audio files and their playback timing are known in advance.

Mix stores and mixes audio in native Go []float64 and natively implements Paul Vögler's "Loudness Normalization by Logarithmic Dynamic Range Compression" (details below).

Best efforts will be made to preserve each API version in a release tag that can be parsed, e.g. github.com/go-mix/mix

Why?

Even after selecting a hardware interface library such as PortAudio or C++ SDL 2.0, there remains a critical design problem to be solved.

This design is a music application mixer. Most available options are geared towards Game development.

Game audio mixers offer playback timing accuracy +/- 2 milliseconds. But that's totally unacceptable for music, specifically sequence-based sample playback.

The design pattern particular to Game design is that the timing of the audio is not known in advance; the timing that really matters is assembled in near-real-time in response to user interaction.

In Music development, the timing is often known in advance. A sequencer, for example, composes music by specifying exactly how, when, and which audio files will be played relative to the beginning of playback.

Ergo, mix seeks to solve the problem of audio mixing for the playback of sequences where the audio files and their playback timing are known in advance. It seeks to do this with the absolute minimum of logical overhead on top of the audio interface.

Mix takes maximum advantage of Go by storing and mixing audio in native Go []float64 and natively implementing Paul Vögler's "Loudness Normalization by Logarithmic Dynamic Range Compression".

Time

To the Mix API, time is specified as a time.Duration-since-epoch, where the epoch is the moment that mix.Start() was called.

Internally, time is tracked as samples-since-epoch at the master out playback frequency (e.g. 48000 Hz). This is most efficient because source audio is pre-converted to the master out playback frequency, and all audio maths are performed in terms of samples.

The Mixing Algorithm

Inspired by the theory paper "Mixing two digital audio streams with on the fly Loudness Normalization by Logarithmic Dynamic Range Compression" by Paul Vögler, 2012-04-20. A .PDF has been included here, from the paper originally published here.

Usage

There's a demo implementation of mix included in the demo/ folder in this repository. Run it using the defaults:

cd demo && go get && go run demo.go

Or specify options, e.g. write WAV bytes to stdout for playback (piped to the system-native aplay):

go run demo.go --out wav | aplay

To show the help screen:

go run demo.go --help

Comments
  • Proposal: let the client specify an `io.Writer` instead of assuming `os.Stdout`

    Hello, this looks like a very cool library. I'm excited to jump in and play with it.

    Would it be a good idea to let the client choose their own io.Writer for audio data to be written to? It would be nice to let the client flexibly stream over TCP, write to a file, or pipe to another process, all using the same underlying mixing implementation with different writers.

    It does seem like it would require a change to the public interface, so that's something to consider.

    Thoughts?

  • Allow output to io.Writer and implement teardown

    Allows specifying the location to write data to (write to any io.Writer).

    Implements Teardown in mix to reset fires and outputToDur to allow subsequent outputs to work.

  • SetFire() blocks process.

    Tested on Ubuntu with output to SDL:

    1. 500 sounds are queued to ontomix
    2. During playback, 500 more sounds are queued.
    3. Expect that playback will be uninterrupted. Actually, playback goes silent momentarily. Queueing a larger number of sounds during step 2 results in longer playback gaps.
  • "Get More Fires" Callback function in API

    The use case is a Music app that implements ontomix for

    • indefinitely long real-time playback, or
    • output to file (likely to happen faster-than-real-time)

    The Music app will call the new Ontomix API method, SetNextFiresCallback and pass in a func() which in turn lets the Music app know that Ontomix is ready for more fires.

    The Music app will call the new Ontomix API method, SetNextFiresBufferDuration, to configure how much queued time Ontomix considers enough "buffer" before it sends the next callback.

  • Is the PortAudio playback binding unstable?

    Early testing on Ubuntu 14 has crashed often. Relates to #17

    So far, I've discovered no mis-implementation in the portaudio bindings. Comments welcome.

    ALSA #21 and SDL #22 enable more options for playback bindings, to choose the most stable option for any given system.

  • Add go-sox loader interface

    Use the github.com/krig/go-sox package to accept any input format supported by sox. Unfortunately, it requires linking to libsox via cgo, which is quite a heavy dependency.

  • Mix cycle includes garbage collection of unused sources.

    Each mix cycle:

    1. Make a new empty map[string]*Source, e.g. keepSources
    2. While iterating over the ready & active fires (see issues #11 and #18; implemented as of pull #29) copy any used *Source to the new keepSources
    3. Replace the mixSources with keepSources
  • Deeper native implementation of go-riff with custom WAV binding

    Having researched the available options, I'm disappointed with what exists for WAV file decoding (specifically, integer vs. float files and parsing extended metadata). Therefore I'm rolling my own WAV binding on top of youpy/go-riff.

  • Is it better to track mix-output frequency as int or float64?

    Currently the mix-output frequency is tracked as float64.

    There is debate whether a "sample rate" is always an integer. I could make an argument for using int instead:

    Assuming the mixing frequency will be:

    • greater than zero
    • in the hundreds of thousands, at the largest.

    But perhaps it's better to leave it as-is, a float64, because it does function nominally at present (if for whatever reason it is desirable to specify an extremely precise velocity of samples-per-second)

  • Deprecate hardware playback functionality for an agnostic next-bytes-provider function

    The concept of hardware playback is ultimately outside of the focus of this project.

    Depending on how a developer wants to implement this library in their project, they might:

    • Pipe the stdout bytes to a .WAV file, or system-native aplay
    • Implement hardware playback using the library of their choice, and retrieve bytes on callback from an agnostic next-bytes-provider function
  • Audio time-scale/pitch modification

    From https://en.wikipedia.org/wiki/Audio_time-scale/pitch_modification

    Time stretching is the process of changing the speed or duration of an audio signal without affecting its pitch. Pitch scaling or pitch shifting is the opposite: the process of changing the pitch without affecting the speed. Similar methods can change speed, pitch, or both at once, in a time-varying way.

    These processes are used, for instance, to match the pitches and tempos of two pre-recorded clips for mixing when the clips cannot be reperformed or resampled. (A drum track containing no pitched instruments could be moderately resampled for tempo without adverse effects, but a pitched track could not). They are also used to create effects such as increasing the range of an instrument (like pitch shifting a guitar down an octave).

  • Concurrent processor usage for mixing

    Especially in a situation where the output is being written directly to stdout, it would be optimal to be able to distribute the load of mathematical operations to multiple processors.

    Examine the sample.OutNext() binding:

    // OutNext to mix the next sample for all channels, in []float64
    func OutNext() []Value {
        return outNextCallback()
    }
    

    If we assign some number n to represent the number of concurrent mix processes, then usage of this sample.OutNext() could be batched such that instead of retrieving one sample at a time, the samples are retrieved in a single map-reduce sweep of n goroutines performing the math. Ergo, the final function signature might look more like:

    // OutNextConcurrent concurrently mixes the next samples for all channels.
    func OutNextConcurrent() []Sample {
        // TODO: map-reduce goroutines of n mix-samples
        return nil // placeholder so the stub compiles
    }
    
  • Evaluate and write tests around WAV binding reading & writing

    • Is it enough to support only 8- and 16-bit integer and 32- and 64-bit float audio?
    • Does the WAV binding read signed vs. unsigned integer audio correctly?
    • Does the WAV binding write signed vs. unsigned integer audio correctly?
  • Write tests to ensure respect for implicit panning of source channels, e.g. 2 channels = stereo L/R

    2-channel source audio playing through a 2-channel output does in fact carry the stereo assumption from source to output. However, this issue will officially persist until there are at least tests to prove it.

    In mix.go, func mixNextSample() []float64 returns one sample per channel; hence the returned []float64 is "one" sample with multiple channels.

    It will be necessary for the mixer to respect the implied panning of certain channel layouts, e.g. "Stereo" = 2 channels = channel 1 pan left + channel 2 pan right

    For now, the only implied panning assumption we will implement is Stereo.

    In the future, it's conceivable that more complex panning assumptions could be implemented, e.g. "Surround Sound"
