Using SoX to convert audio files in R

by | Apr 1, 2022 | Category 1 | 3 comments

R is an awesome application for data science, but it does have its limitations. Working with .wav files during my PhD, for instance, brought up some important issues. Early on, the most important of these was how to convert audio files obtained from industry sources into a format that played nicely with R. What I quickly found was that R on its own was not up to the task.

List of packages and functions to convert audio files

 

R package   Function             Notes
av          av_audio_convert()   Requires uncompressed PCM .wav/.mp3 files
audio       load.wave()          Requires uncompressed PCM .wav/.mp3 files
tuneR       readWave()           Requires uncompressed PCM .wav/.mp3 files
warbleR     read_wave()          Wrapper for tuneR’s readWave()

 

As you can see, each function requires the .wav file to be in a specific format. However, what I quickly realised is that industry doesn’t necessarily format its recordings for analysis in R, but rather for its own practical purposes. Furthermore, the formats favoured by industry tend to be compressed to save storage space and minimise expenses.

This is where I discovered SoX. SoX is pretty amazing and has rightly been called the Swiss-Army knife of sound processing programs.

What I found most compelling was the array of formats that SoX can handle:

Audio files that can be converted by SoX

  • Raw files in various binary formats
  • Raw textual data
  • Amiga 8svx files
  • Apple/SGI AIFF files
  • SUN .au files
    • PCM, u-law, A-law
    • G7xx ADPCM files (read only)
    • mutant DEC .au files
    • NeXT .snd files
  • AVR files
  • CDDA (Compact Disc Digital Audio format)
  • CVS and VMS files (continuous variable slope)
  • Grandstream ring-tone files
  • GSM files
  • HTK files
  • LPC-10 files
  • Macintosh HCOM files
  • Amiga MAUD files
  • AMR-WB & AMR-NB (with optional libamrwb & libamrnb libraries)
  • MP2/MP3 (with optional libmad, libtwolame and libmp3lame libraries)
  • MP4, AAC, AC3, WAVPACK, AMR-NB files (with optional ffmpeg library)
  • AVI, WMV, Ogg Theora, MPEG video files (with optional ffmpeg library)
  • Ogg Vorbis files (with optional Ogg Vorbis libraries)
  • FLAC files (with optional libFLAC)
  • IRCAM SoundFile files
  • NIST SPHERE files
  • Turtle beach SampleVision files
  • Sounder & Soundtool (DOS) files
  • Yamaha TX-16W sampler files
  • SoundBlaster .VOC files
  • Dialogic/OKI ADPCM files (.VOX)
  • Microsoft .WAV files
    • PCM, floating point
    • u-law, A-law, MS ADPCM, IMA (DMI) ADPCM
    • GSM
    • RIFX (big endian)
  • WavPack files (with optional libwavpack library)
  • Psion (palmtop) A-law WVE files and Record voice notes
  • Maxis XA Audio files
    • EA ADPCM (read support only, for now)
  • Pseudo formats that allow direct playing/recording from most audio devices
  • The “null” pseudo-file that reads and writes from/to nowhere

Reading through the list, I think you’ll agree that it can handle most formats!

Installing SoX

The easiest way to install SoX is via a third-party app called Homebrew (a package manager made specifically for macOS).

Chris Rosser provides a lovely overview of how to install SoX on a Mac: https://chrisrosser.medium.com/using-sox-on-macos-48f25014d1e3

Here is a guide to install SoX on Windows 10 too: https://www.tutorialexample.com/a-step-guide-to-install-sox-sound-exchange-on-windows-10-python-tutorial/

 

That’s it! Now it’s time to have some fun with SoX.
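Before going further, it can help to confirm that R can actually see SoX. With Homebrew the install command is typically `brew install sox`. The snippet below is a minimal sketch in base R: `find_sox()` is a hypothetical helper name (not part of any package), and it assumes the executable is called `sox` and is on your PATH.

```r
# Hypothetical helper: look for the SoX executable on the PATH.
# Returns the full path as a string, or "" when nothing is found.
find_sox <- function(exename = "sox") {
  unname(Sys.which(exename))
}

sox_path <- find_sox()

if (nzchar(sox_path)) {
  # Ask SoX for its version string to confirm that it runs.
  print(system2(sox_path, "--version", stdout = TRUE))
} else {
  message("SoX was not found on the PATH; install it or note its folder for later.")
}
```

If the lookup comes back empty even though SoX is installed, the folder containing the binary is probably not on the PATH that R sees.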

How to convert audio files using the SoX interface in R

The seewave package in R has a lovely function, sox(), that passes SoX commands from R to the terminal. However, I found that it needed a lot of tweaking, as the documentation wasn’t amazing.

The function is called as follows:

sox(command, exename = NULL, path2exe = NULL)

  • command = the SoX command that you want to run, i.e. what you’d normally type into the terminal after the word sox. Generally it features an input file, one or more SoX options, and then an output file.
  • exename = the name of the SoX binary file (on a Mac, the name “sox” is used by default, which happens to correspond with the program name when it installs).
  • path2exe = the path to the folder that contains the SoX binary
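To make the three arguments concrete, here is a rough sketch of how I understand them to combine into a single terminal command. `preview_sox_call()` is a hypothetical helper written for illustration only; it is not part of seewave, and seewave’s actual implementation may differ in its details.

```r
# Hypothetical helper (not part of seewave): shows the command line that a
# call such as sox(command, exename, path2exe) is expected to hand to the shell.
preview_sox_call <- function(command, exename = "sox", path2exe = NULL) {
  # Without path2exe the executable is looked up by name alone;
  # with it, the folder and the executable name are joined into one path.
  exe <- if (is.null(path2exe)) exename else file.path(path2exe, exename)
  paste(exe, command)
}

# Default case: the binary is called "sox" and is on the PATH.
preview_sox_call("recordings/a.wav -e float mod_rec/a.wav")

# Explicit folder and executable name (the folder here is made up).
preview_sox_call("in.wav out.wav", exename = "sox.exe", path2exe = "C:/sox")
```

Printing the string first is a cheap way to spot a malformed command before anything is executed.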

Where I ran into problems was with the actual command. In my experience, SoX called this way doesn’t really like full file paths for the input and output files; it prefers short paths given relative to the top level of your project folder.

I tend to use here() in R for myriad reasons (see this post by Jenny Richmond: http://jenrichmond.rbind.io/post/how-to-use-the-here-package/ ).

Typing here() in R reveals the top level of your project, so just make sure the sound files sit beneath it (in the script below, in a folder called recordings).

library(seewave)  # provides sox()
library(here)     # provides here()

# make sure the output folder exists (relative to the working directory)
dir.create("mod_rec", showWarnings = FALSE)

# list the filenames of the .wav files in the folder
filenames <- list.files(here("recordings"), pattern = "\\.wav$", full.names = FALSE)

# iterate through each .wav file
for (i in seq_along(filenames)) {
  # the input file, read from the folder recordings
  tmp1 <- paste0("recordings", "/", filenames[i])

  # the output file, written to the folder mod_rec
  tmp2 <- paste0("mod_rec", "/", filenames[i])

  # the actual command: '-e float', placed before the output file,
  # sets the encoding of the output file to floating point
  command <- paste(tmp1, "-e", "float", tmp2)

  # sox() then sends this command over to the terminal for execution
  sox(command)
}
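SoX normally works out the output format from the file extension, so converting to a different container (FLAC, say, if your SoX build includes libFLAC) is mostly a matter of changing the output name. The sketch below uses only base R; `output_path()` is a hypothetical helper name, and the folder name mod_rec simply mirrors the loop above.

```r
# Hypothetical helper: build the output path for a recording, optionally
# swapping the file extension (e.g. "wav" -> "flac").
output_path <- function(input_name, out_dir = "mod_rec", ext = NULL) {
  if (!is.null(ext)) {
    # Strip the old extension and attach the new one.
    input_name <- paste0(tools::file_path_sans_ext(input_name), ".", ext)
  }
  file.path(out_dir, input_name)
}

example_files <- c("site1.wav", "site2.wav")  # made-up file names

output_path(example_files)                # same names, written to mod_rec
output_path(example_files, ext = "flac")  # same names with a .flac extension
```

Inside the loop, the result of `output_path()` would take the place of tmp2.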

 

This simple script allowed me to convert around 500 compressed .wav recordings, averaging 20 minutes each, in around 5 minutes.

I actually got SoX to do a couple of other things for me at the same time, such as converting the sampling rate from 44100 Hz down to 8000 Hz (some readers may well balk at this loss of information!) and writing only one channel. I did this via the following:

command <- paste(tmp1, "-e", "float", "-r", "8000", "-c", "1", tmp2)

Here the ‘-r’ option sets the sampling rate of the output file to 8 kHz, while the ‘-c’ option tells SoX to write only a single channel. Note that tmp2 is passed as a variable, not as the quoted string "tmp2".
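As the number of options grows, pasting the pieces together by hand gets error-prone. One possible way to tidy this up is a small wrapper that assembles the command string from named arguments. This is only a sketch: `build_sox_command()` and its argument names are my own invention, and it covers just the ‘-e’, ‘-r’ and ‘-c’ options discussed above.

```r
# Hypothetical helper: assemble the string passed to sox() from named options.
build_sox_command <- function(input, output, encoding = "float",
                              rate = NULL, channels = NULL) {
  opts <- c("-e", encoding)
  if (!is.null(rate)) {
    # scientific = FALSE keeps large rates from printing as e.g. 1e+05
    opts <- c(opts, "-r", format(rate, scientific = FALSE))
  }
  if (!is.null(channels)) {
    opts <- c(opts, "-c", format(channels, scientific = FALSE))
  }
  # Output options go between the input file and the output file.
  paste(c(input, opts, output), collapse = " ")
}

build_sox_command("recordings/a.wav", "mod_rec/a.wav", rate = 8000, channels = 1)
```

The returned string can be inspected first and then handed to sox() unchanged.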

SoX can do a wealth of other things too. Here is a ‘cheat sheet’ https://gist.github.com/ideoforms/d64143e2bad16b18de6e97b91de494fd

SoX can also combine different audio files, apply gain, and normalise, trim and equalise an audio file. However, I’d argue that R can do those things natively, and a little more transparently for my liking.
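For reference, a few of these operations look roughly like this as command strings, i.e. the part you would pass to sox() or type after the word sox in the terminal. The file names are placeholders, and this is a sketch of common SoX usage rather than anything specific to my workflow.

```r
# Each string is the part that follows "sox" on the command line.
# File names are placeholders.
sox_examples <- c(
  combine   = "first.wav second.wav combined.wav",  # concatenate two files
  trim      = "in.wav out.wav trim 0 30",           # keep the first 30 seconds
  gain      = "in.wav out.wav gain -3",             # reduce the level by 3 dB
  normalise = "in.wav out.wav norm"                 # normalise the peak level
)

# Print the commands for inspection before running any of them.
print(sox_examples)
```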

Conclusion

Should you come into possession of audio files written in a variety of arcane formats (arcane, but highly useful to someone!), you need look no further than SoX. I consider it a godsend when you need to analyse a variety of audio files using statistical tools like R.

3 Comments

  1. נערות ליווי במרכז

    Greetings! Very useful advice in this particular article! Its the little changes that will make the largest changes. Thanks a lot for sharing!

  2. Jay Winiarski

    Hi there – this looks really useful, and I’d like to modify the code for my own use case (batch converting wav to flac). When I run the following script no flac file is created, and I’m just wondering if there’s anything I would need to modify from your code to do that?

    Here’s the code I’m running:

    library(seewave)
    library(here)

    # synthesis of a 1kHz sound
    b<-synth(d=10,f=8000,cf=1000)
    # save it as a .wav file in the default working directory
    savewav(b,f=8000, file = here::here('recordings/example.wav'))

    # list the filenames of the .wav files in the folder
    filenames <- list.files(here('recordings'), pattern = '*.wav', full.names = FALSE)

    #create a for loop to iterate through each .wav file

    for (i in 1:length(filenames)) {

    #the input file obtained the folder recordings
    tmp1 <- paste0("recordings", "/", filenames[i])

    #the output file put into the folder mod_rec

    tmp2 <- paste0("mod_rec", "/", filenames[i])
    # the actual command, which has ‘-e’ to invoke the change format function, ‘float’ to specify the format to which you want to convert the input file.

    command <- paste(tmp1, "-e", "float", tmp2)
    # sox will then send this command over to the terminal for execution.

    sox(command)

    }

    Thanks!

    Jay

    • Jay Winiarski

      Nevermind, the above works as long as I set the path to Sox in the sox() function:

      sox(command, path2exe = "C:/Program Files (x86)/sox-14-4-2")

      …and change ‘tmp2’ to end in .flac.

      Thanks!
