June | 2010 | Marwalk's Blog

FredPod 100531 soX audio utility

June 1, 2010

This program was produced on May 31, 2010. And today’s topics will include:

Selections from “yum info recent” on the Fedora Linux project.

Today’s feature is SoX, as in “Sound eXchange,” the Swiss Army knife of audio manipulation. Included will be a demo of the sox “tempo” feature.

And we’ll close with a Creative Commons licensed work by Terri England.

FredPod by Mark Walker is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License

Here are some recently updated items in yum at the Fedora project:

Name : gpointing-device-settings

Summary : Configuration tool for pointing devices

Description: GUI tool for setting pointing devices such as TrackPoint or Touchpad. It allows configuring of various driver parameters on the fly. It is a successor of GSynaptics.

Name : ruby

Summary : An interpreter of object-oriented scripting language

Description: Ruby is the interpreted scripting language for quick and easy object-oriented programming. It has many features to process text files and to do system management tasks (as in Perl). It is simple, straight-forward, and extensible.

Name : vidalia (think onion routing here)

Summary : GUI controller for the Tor Onion Routing Network

Description: Vidalia is a cross-platform controller GUI for Tor, built using the Qt framework. Vidalia allows you to start and stop Tor, view the status of Tor at a glance, and monitor Tor’s bandwidth usage. Vidalia also makes it easy to contribute to the Tor network by helping you setup a Tor server, if you wish.

Today’s feature item is sox, as in Sound eXchange, a universal sound sample translator, also known as the Swiss Army knife of audio manipulation. The beauty of sox is that it’s a command line program. Yes, you can do some things with sound better on the command line. And that is one thing that makes sox an excellent complement to GUI applications such as Audacity.

The sox man pages, written by Chris Bagwell and others, provide clear and easy-to-follow commands that can have you converting audio files in minutes. Later I’ll demonstrate the “tempo” function of sox so you can hear an example of its utility.

So let’s start by following the man, as in the man pages for sox. SoX reads and writes audio files in most popular formats and can optionally apply effects to them; it can combine multiple input sources, synthesize audio, and, on many systems, act as a general purpose audio player or a multi-track audio recorder. It also has limited ability to split the input into multiple output files.

Almost all SoX functionality is available using just the sox command. However, to simplify playing and recording audio, if SoX is invoked as play, the output file is automatically set to be the default sound device, and if invoked as rec, the default sound device is used as an input source. Additionally, the soxi command provides a convenient way to just query audio file header information. Soxi is the Sound eXchange Information utility, which displays sound file metadata.

The heart of SoX is a library called libSoX. Those interested in extending SoX or using it in other programs should refer to the libSoX man page.

As a command-line audio processing tool, sox is particularly suited to making quick, simple edits and to batch processing. If you need an interactive, graphical audio editor, you can always use Audacity.
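To give a feel for what batch processing could look like, here is a minimal sketch that prints one sox conversion command per input file rather than running anything. The function name batch_convert_cmds, the file names, and the choice of converting .au to .wav are all assumptions made up for this illustration; they are not from the man pages.

```shell
#!/bin/sh
# Dry-run sketch: print one sox command per input file instead of running it.
# batch_convert_cmds is a hypothetical helper name, not part of SoX.
batch_convert_cmds() {
    for src in "$@"; do
        # Strip the .au suffix and add .wav for the output name.
        dst="${src%.au}.wav"
        printf 'sox "%s" "%s"\n' "$src" "$dst"
    done
}

# Example with made-up file names; nothing is read from disk.
batch_convert_cmds recital.au encore.au
```

Printing the commands first lets you look them over before committing to a whole directory of conversions.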

The overall SoX processing chain can be summarized as follows:

Input(s) → Combiner → Effects → Output(s)

To show how this works in practice, here is a selection of examples of how SoX might be used. The simple

sox recital.au recital.wav

translates an audio file in Sun AU format to a Microsoft WAV file, while

sox recital.au -r 12k -b 8 -c 1 recital.wav vol 0.7 dither

performs the same format translation, but also changes the audio sampling rate (the -r 12k) and sample size (the -b 8), down-mixes to mono (the -c 1), and applies the vol and dither effects.

sox -r 8k -u -b 8 -c 1 voice-memo.raw voice-memo.wav

converts ‘raw’ (a.k.a. ‘headerless’) audio to a self-describing file format,

sox slow.aiff fixed.aiff speed 1.027

adjusts audio speed,

sox short.au long.au longer.au

concatenates two audio files, and

sox -m music.mp3 voice.wav mixed.flac

mixes together two audio files.
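To put the -r, -b and -c settings from the second example in perspective, the uncompressed data rate is just sample rate times bits per sample times channels, divided by eight to get bytes. The sketch below works that out; the helper name pcm_bytes_per_second is invented for this illustration, and the CD-quality figures are included only as a familiar comparison, not as the actual format of recital.au.

```shell
#!/bin/sh
# Uncompressed PCM data rate in bytes per second:
# sample rate (Hz) * bits per sample * channels / 8.
# pcm_bytes_per_second is a hypothetical helper name, not part of SoX.
pcm_bytes_per_second() {
    echo $(( $1 * $2 * $3 / 8 ))
}

pcm_bytes_per_second 12000 8 1    # the -r 12k -b 8 -c 1 settings: 12000
pcm_bytes_per_second 44100 16 2   # CD-quality stereo, for comparison: 176400
```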

My observation, from just this short list of the many rich options in sox, is that the switches are very intuitive. For example:

-m to mix and combine separate files,

-r to change the sample rate,

-b to change the sample bit size, perhaps call that “aural resolution,”

-c to change the number of channels, and

speed to change the pitch and tempo together, just like speeding up a mechanical tape recorder.
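Since speed moves the pitch along with the tempo, it can help to know how far the pitch moves. Musically, a speed ratio corresponds to 1200 times the base-2 logarithm of that ratio, measured in cents (100 cents to a semitone). Here is a rough sketch of that arithmetic; the helper name speed_to_cents is made up for this example and is not a SoX command.

```shell
#!/bin/sh
# Pitch change, in cents, caused by a given speed factor: 1200 * log2(factor).
# speed_to_cents is a hypothetical helper name, not part of SoX.
speed_to_cents() {
    LC_ALL=C awk -v f="$1" 'BEGIN { printf "%.1f\n", 1200 * log(f) / log(2) }'
}

speed_to_cents 1.027   # the factor from the speed example: about 46.1 cents
speed_to_cents 2       # doubling the speed raises the pitch one octave: 1200.0
```

So the 1.027 correction in the earlier example nudges the pitch up by a little under half a semitone.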

Are you lovin’ it yet?

The sox utilities also include play and rec. For example, you can use wildcard filenames, such as *.ogg, to listen to several files. The command

play "My music/*.ogg" bass +3

plays a collection of audio files while applying a bass boosting effect. The word “bass” is spelled b-a-s-s in this case.

There are many more advanced features in sox, such as synthesizing sounds, and manipulating encoding as signed integer, unsigned integer, floating point, and other advanced audio processing options. Even so, explicitly specifying other encoding types (such as MP3 or FLAC) is not necessary since they can be inferred from the file type or header.

So let’s get on with the demo. This will use the tempo option in sox. This changes the audio tempo (but not its pitch). The audio is chopped up into segments which are then shifted in the time domain and overlapped and cross-faded at points where their waveforms are most similar (as determined by measurement of ‘least squares’).

The syntax for the tempo feature is

sox input-file output-file tempo factor

The factor is a decimal number that gives the ratio of the new tempo to the old tempo.

If the factor value is greater than one, the audio will speed up and end earlier than the original. If the factor value is less than one, the audio will slow down and end later than the original.

In the FredPod ringtone file I have, FPRT.flac, the audio is 10.7 seconds long, as follows:

FPRT.flac

By applying the command

sox FPRT.flac FPRTslow.flac tempo 0.8

the audio is extended to 13.4 seconds, as follows:

FPRTslow.flac

Now, by applying the command

sox FPRT.flac FPRTfast.flac tempo 1.2

the audio is shortened to 8.9 seconds, as follows:

FPRTfast.flac
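The durations in the demo follow directly from the factor: the new length is the old length divided by the tempo factor. As a quick check, here is a small sketch of that arithmetic. The helper name tempo_duration is invented for this illustration; it only does the division and does not call sox.

```shell
#!/bin/sh
# New duration after a tempo change: original duration divided by the factor.
# tempo_duration is a hypothetical helper name, not part of SoX.
tempo_duration() {
    LC_ALL=C awk -v d="$1" -v f="$2" 'BEGIN { printf "%.1f\n", d / f }'
}

tempo_duration 10.7 0.8   # slowed-down ringtone: 13.4
tempo_duration 10.7 1.2   # sped-up ringtone: 8.9
```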

There are a couple of fine points noted in the sox man pages.

SoX converts all audio files to an internal uncompressed format before performing any audio processing; this means that manipulating a file that is stored in a lossy format can cause further losses in audio fidelity. For example, with the command

sox long.mp3 short.mp3 trim 10

SoX first decompresses the input MP3 file, then applies the trim effect, and finally creates the output MP3 file by re-compressing the audio, with a possible reduction in fidelity above that which occurred when the input file was created. Hence, if what is ultimately desired is lossily compressed audio, it’s highly recommended to perform all audio processing using lossless file formats and then convert to the lossy format only at the final stage. I like to work with FLAC files, or at least WAV files, if at all possible before exporting out to MP3 and Ogg.

Also it’s worth knowing that:

Applying multiple effects with a single SoX invocation will, in general, produce more accurate results than those produced using multiple SoX invocations. So piling your command line high with all the toppings at once is also recommended with sox.
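Putting those two fine points together, one possible shape for a command is a single invocation that reads a lossless file, applies the whole effects chain, and writes the lossy format only at the end. The sketch below just prints such a command. The helper name single_pass_cmd, the file names, and the particular effects are assumptions chosen for illustration, and writing MP3 assumes a sox build with MP3 support.

```shell
#!/bin/sh
# Dry-run sketch: build one sox command that applies several effects at once
# and only writes the lossy format as the final step.
# single_pass_cmd is a hypothetical helper name, not part of SoX.
single_pass_cmd() {
    src="$1"; dst="$2"; shift 2
    # Remaining arguments are the effects chain, passed through unchanged.
    printf 'sox %s %s %s\n' "$src" "$dst" "$*"
}

# Skip the first 10 seconds, slow the tempo, and lower the volume in one pass,
# reading a lossless FLAC master and writing MP3 only at the end.
single_pass_cmd master.flac final.mp3 trim 10 tempo 0.8 vol 0.7
```

The alternative, three separate sox runs with intermediate files, would convert to and from a file format at each step, which is the extra work the man page advice is steering you away from.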

And now that we’ve gone over all that, it’s time to make some music.

Today’s podcast will close with a Creative Commons licensed work by Terri England entitled “Spring,” from Mevio’s Music Alley. Check it out at music.mevio.com.

Enjoy.
