
Deviant Syndrome


coding, multimedia, gamedev, engineering


Docker Audio Hell Pt. 1

And The Passive Amoeba of Linux Audio


In this article, we are going to quickly go over why audio production on Linux can be challenging (if not frustrating at times). I think the main reason, besides my own incompetence of course, is that audio facilities are handled in too much of a Unix way. But what is the Unix way, anyway? Well, in short, we have one isolated piece of software doing one single thing, but doing it well. If any additional processing needs to be done with the output, we pipe (|) it into another isolated piece of software. Several such steps form a pipeline. Sounds great, right? Well, unfortunately, sometimes the seeming tidiness of this approach can quickly go down the tubes (pun intended). Once you need something specific, or something simply done your way, you’ll have to explore the whole pipeline from end to end. That’s alright if the nodes are simple utilities like cat or grep, but imagine exploring each of them when it has massive tectonic layers of legacy software archaeology piled on top of it.
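To make the pipeline idea concrete, here is a minimal sketch in Python. The stage names `cat` and `grep` only imitate the real utilities, and the device list is made up for illustration; this is not how a shell implements pipes, just the same "small stages chained together" shape.

```python
from typing import Callable, Iterable, Iterator, List

# A stage takes a stream of lines and returns a stream of lines.
Stage = Callable[[Iterable[str]], Iterator[str]]


def cat(lines: Iterable[str]) -> Iterator[str]:
    """Pass every line through unchanged, loosely like `cat`."""
    yield from lines


def grep(pattern: str) -> Stage:
    """Build a stage that keeps only lines containing `pattern`."""
    def stage(lines: Iterable[str]) -> Iterator[str]:
        return (line for line in lines if pattern in line)
    return stage


def pipeline(source: Iterable[str], *stages: Stage) -> List[str]:
    """Feed `source` through each stage in order, like `a | b | c`."""
    stream: Iterable[str] = source
    for stage in stages:
        stream = stage(stream)
    return list(stream)


if __name__ == "__main__":
    # Hypothetical device listing, invented for this example.
    devices = ["card 0: HDA Intel", "card 1: USB Audio", "card 2: HDMI"]
    print(pipeline(devices, cat, grep("USB")))
```

Each stage knows nothing about its neighbours, which is exactly what makes the chain tidy when the stages are trivial and painful when each one hides years of history.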

So, here is how this usually happens. It all starts with ALSA, a kernel subsystem that works directly with your audio hardware. It does not handle software mixing and routing efficiently, though, so applications cannot easily share your device inputs and outputs, and that is before we even get to routing audio between applications. To satisfy those needs, PulseAudio was introduced: a software wrapper around ALSA that routes all audio streams through itself and distributes them between the existing hardware inputs (sources) and outputs (sinks). Applications compiled to use PulseAudio as their audio driver cannot use ALSA directly, all according to the highest standards of incomprehensible madman’s logic we maintain in the world of programming.
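For a feel of what "software mixing" means here, below is a rough sketch of summing several 16-bit PCM streams into one before handing them to a single output. This is an assumption-laden toy, not PulseAudio's actual mixer: the function name `mix_streams` is hypothetical, and real sound servers also deal with resampling, per-stream volume and timing.

```python
from typing import List, Sequence

# Range of a signed 16-bit PCM sample.
INT16_MIN, INT16_MAX = -32768, 32767


def mix_streams(*streams: Sequence[int]) -> List[int]:
    """Sum 16-bit PCM streams sample by sample, clamping to the int16 range.

    Shorter streams are treated as silent once they run out of samples.
    """
    if not streams:
        return []
    length = max(len(s) for s in streams)
    mixed: List[int] = []
    for i in range(length):
        total = sum(s[i] for s in streams if i < len(s))
        # Hard clipping keeps the result representable as int16.
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


if __name__ == "__main__":
    music = [1000, 30000, -200]
    notification = [2000, 10000]
    print(mix_streams(music, notification))
```

Without something doing this job in software, two applications wanting the same output would have to take turns.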

PulseAudio, however, had some issues with latency, which made it hardly usable for “professional” audio recording, which usually involves things like near-real-time record monitoring. To solve a problem we had introduced ourselves, we have an alternative to PulseAudio called JACK. Does this remind you of “The Futurological Congress”? For me it totally does. For a long time, JACK was the standard for (semi)professional audio work on Linux. Apparently, it does not play nicely with PulseAudio. Some applications (e.g. Wine) do not support JACK at all. Of course, Wine has WineASIO, which can be routed to JACK; however, that only works for applications that use ASIO for audio, i.e. DAWs.
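To put a number on "latency": the delay contributed by an audio buffer is roughly its length in frames divided by the sample rate. The sketch below assumes the common "frames per period times number of periods" buffer model; the function name is hypothetical, and real-world latency also includes hardware and driver overhead that this ignores.

```python
def buffer_latency_ms(frames_per_period: int, periods: int, sample_rate: int) -> float:
    """Approximate buffering delay in milliseconds.

    latency = frames_per_period * periods / sample_rate
    """
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return 1000.0 * frames_per_period * periods / sample_rate


if __name__ == "__main__":
    # A small buffer, the kind you want for live monitoring.
    print(f"{buffer_latency_ms(256, 2, 48000):.2f} ms")
    # A large buffer, fine for playback but awkward for recording.
    print(f"{buffer_latency_ms(4096, 2, 48000):.2f} ms")
```

With 256 frames and 2 periods at 48 kHz this comes out at a little under 11 ms, while a 4096-frame period pushes it past 170 ms, which is why monitoring yourself through a large buffer feels like playing in a canyon.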

Both JACK and PulseAudio have their own solutions for audio over the network. PulseAudio has native client-server support, and also a special audio sink for streaming audio over RTP (is it the same thing as zeroconf or not?). Judging by its configuration strings, PulseAudio’s client-server support can use TCP as a transport, as well as something called “native”, which I presume means standard Unix sockets.
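As an illustration of the two transports, here is a small sketch that builds the kind of server address string a PulseAudio client can be pointed at through the `PULSE_SERVER` environment variable. The helper name `pulse_server_address`, the example host and the socket path are assumptions for illustration; 4713 is the port PulseAudio's TCP module listens on by default.

```python
import os

DEFAULT_PULSE_TCP_PORT = 4713


def pulse_server_address(transport: str, target: str,
                         port: int = DEFAULT_PULSE_TCP_PORT) -> str:
    """Build a server string for the PULSE_SERVER environment variable.

    transport: "tcp" (target is a host name or IP address)
               or "unix" (target is the path of a socket file).
    """
    if transport == "tcp":
        return f"tcp:{target}:{port}"
    if transport == "unix":
        return f"unix:{target}"
    raise ValueError(f"unsupported transport: {transport!r}")


if __name__ == "__main__":
    # Hypothetical remote machine running a PulseAudio server.
    os.environ["PULSE_SERVER"] = pulse_server_address("tcp", "192.168.1.10")
    print(os.environ["PULSE_SERVER"])
    # Hypothetical local socket path; the real one depends on your system.
    print(pulse_server_address("unix", "/run/user/1000/pulse/native"))
```

Any PulseAudio client started with that variable set would try to talk to the given server instead of the local default, assuming the server side has been configured to accept such connections.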


JACK provides several ways to do remote audio networking: its native netone add-on, which uses the CELT codec and a master/follower pattern, and the newer netJACK2, which has network discovery.

Lately, yet another ultra-low-latency, professional-audio-grade Unix sound system was introduced. It’s called PipeWire, and it is sort of a chameleon, which can act as a PulseAudio backend for PulseAudio clients, a JACK backend for JACK clients, and so on. AFAIK, PipeWire does not have audio-over-network support.

NOTE: Some amoebas are not shown in the figures, simply because they are considered extinct. Before ALSA, earlier versions of Linux shipped with Open Sound System (OSS), which then went proprietary and no longer counted as free software. However, as we know, anything once introduced at the kernel level is destined to be supported for eternity. That’s why ALSA still has an OSS emulation mode.