R is an awesome application for data science, but it does have its limitations. Working with .wav files during my PhD, for instance, brought up some important issues. Early on, the most important of these was how to convert audio files obtained from industry sources into a format that played nicely with R. What I quickly found was that R on its own was not up to the task.
List of packages and functions to convert audio files
R package | Function | Notes |
av | av_audio_convert() | Requires uncompressed PCM .wav/.mp3 files |
audio | load.wave() | Requires uncompressed PCM .wav/.mp3 files |
tuneR | readWave() | Requires uncompressed PCM .wav/.mp3 files |
warbleR | read_wave() | Wrapper for tuneR’s readWave() |
As you can see, each function requires that the .wav file is in a specific format. However, what I quickly realised is that industry doesn’t necessarily format its recordings for analysis in R, but rather for its own practical purposes. Furthermore, the formats favoured by industry tend to be compressed to save on storage and minimise expenses.
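Before reaching for a converter, it can help to check what encoding a recording actually uses. Here is a minimal sketch in base R that reads the format fields from a WAV header. The function name wav_header_info() is my own invention for illustration, and the sketch assumes the common "canonical" layout where the fmt chunk immediately follows the RIFF/WAVE identifiers; files with extra chunks in front will be rejected rather than parsed.

```r
# Hypothetical helper: read the encoding fields from a canonical WAV header.
# Assumes the "fmt " chunk starts at byte 12, which is common but not guaranteed.
wav_header_info <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))

  riff <- readBin(con, "raw", n = 4)
  readBin(con, "raw", n = 4)              # overall chunk size, not needed here
  wave <- readBin(con, "raw", n = 4)
  fmt  <- readBin(con, "raw", n = 4)

  if (!identical(riff, charToRaw("RIFF")) ||
      !identical(wave, charToRaw("WAVE")) ||
      !identical(fmt,  charToRaw("fmt "))) {
    stop("Not a canonical RIFF/WAVE header: ", path)
  }

  readBin(con, "raw", n = 4)              # fmt chunk size, not needed here

  # All header fields are little-endian
  format_tag  <- readBin(con, "integer", n = 1, size = 2,
                         signed = FALSE, endian = "little")
  channels    <- readBin(con, "integer", n = 1, size = 2,
                         signed = FALSE, endian = "little")
  sample_rate <- readBin(con, "integer", n = 1, size = 4, endian = "little")

  list(format_tag = format_tag, channels = channels, sample_rate = sample_rate)
}
```

A format tag of 1 means uncompressed PCM and 3 means IEEE float; other values (6 for A-law and 7 for mu-law, for example) indicate an encoding that the readers in the table above may refuse.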
This is where I discovered SoX. SoX is pretty amazing and has rightly been called the Swiss-Army knife of sound processing programs.
What I found most compelling was the array of formats that SoX can handle:
Audio files that can be converted by SoX
Reading through the list, I think you’ll agree that it can handle most formats!
Installing SoX
The easiest way to install SoX on a Mac is via a third-party app called Homebrew (a package manager made for macOS).
Chris Rosser provides a lovely overview of how to install SoX on a Mac: https://chrisrosser.medium.com/using-sox-on-macos-48f25014d1e3
Here is a guide to install SoX on Windows 10 too: https://www.tutorialexample.com/a-step-guide-to-install-sox-sound-exchange-on-windows-10-python-tutorial/
That’s it! Now it’s time to have some fun with SoX.
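A quick way to confirm that the install worked from inside R is to ask whether a sox executable can be found on your PATH. This is only a sketch: sox_available() is a name I made up for illustration, and it assumes the binary is called sox (or sox.exe on Windows) and that its folder has been added to the PATH.

```r
# Hypothetical helper: TRUE if an executable called "sox" is on the PATH.
sox_available <- function() {
  # Sys.which() returns "" when the program cannot be found
  sox_path <- unname(Sys.which("sox"))
  nzchar(sox_path)
}

if (sox_available()) {
  message("SoX found at: ", Sys.which("sox"))
} else {
  message("SoX not found on the PATH; you may need to pass its location to R explicitly.")
}
```

If this reports that SoX is missing even though it is installed, the likely cause is that its folder is not on the PATH, which is common on Windows.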
How to convert audio files using the SoX interface in R
The seewave package in R has a lovely interface that passes SoX commands from R to the terminal. However, I found the script needed a lot of tweaking as the documentation wasn’t amazing.
The script is written as follows:
sox(command, exename = NULL, path2exe = NULL)
- command = the SoX command that you want to use, and that you’d normally pop into the terminal interface. Generally it features an input file, one or more SoX options, and then an output file.
- exename = the name of the SoX binary file (on a Mac, the name “sox” is used by default, which happens to correspond with the program name when it installs).
- path2exe = the file path to the SoX binary
Where I ran into problems was with the actual command. What I found was that SoX doesn’t really like full file addresses for the input/output files, but rather likes a short path to the .wav file, starting from the top level of your project.
I tend to use here() in R for myriad reasons (see this post by Jenny Richmond: http://jenrichmond.rbind.io/post/how-to-use-the-here-package/ ).
By typing here(), R will reveal the top level of your project, so just make sure the folder holding the sound files is located there.
library(seewave)
library(here)

# list the filenames of the .wav files in the folder
filenames <- list.files(here("recordings"), pattern = "\\.wav$", full.names = FALSE)

# create a for loop to iterate through each .wav file
for (i in seq_along(filenames)) {
  # the input file, obtained from the folder recordings
  tmp1 <- paste0("recordings", "/", filenames[i])
  # the output file, put into the folder mod_rec
  tmp2 <- paste0("mod_rec", "/", filenames[i])
  # the actual command: '-e' sets the encoding of the output file,
  # and 'float' is the encoding to which the input file is converted
  command <- paste(tmp1, "-e", "float", tmp2)
  # sox() then sends this command over to the terminal for execution
  sox(command)
}
This simple script allowed me to easily convert around 500 compressed .wav recordings of average length 20 minutes in around 5 minutes.
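One thing the loop takes for granted is that the mod_rec folder already exists. Here is a small sketch of how the input and output paths could be prepared up front, with the output folder created if it is missing. The function plan_conversions() and its out_ext argument are hypothetical names of mine, not part of seewave or SoX.

```r
# Hypothetical helper: pair each .wav recording with the path it will be written to.
plan_conversions <- function(in_dir, out_dir, out_ext = "wav") {
  # list.files() takes a regular expression, so match names ending in .wav
  inputs <- list.files(in_dir, pattern = "\\.wav$", ignore.case = TRUE)

  # Create the output folder if it is missing, so there is somewhere to write
  if (!dir.exists(out_dir)) dir.create(out_dir, recursive = TRUE)

  # Swap the extension in case a different output format is wanted
  outputs <- sub("\\.wav$", paste0(".", out_ext), inputs, ignore.case = TRUE)

  data.frame(
    input  = file.path(in_dir, inputs),
    output = file.path(out_dir, outputs),
    stringsAsFactors = FALSE
  )
}
```

Each row of the result can then be pasted into a command in the same way as tmp1 and tmp2 in the loop, for example paste(plan$input[i], "-e", "float", plan$output[i]).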
I actually got SoX to do a couple of other things for me at the same time, such as converting the sampling rate from 44100 Hz down to 8000 Hz (some readers may well balk at this loss of information!) and writing only one channel. I did this via the following:
command <- paste(tmp1, "-e", "float", "-r", "8000", "-c", "1", tmp2)
It’s pretty easy to see that the ‘-r’ option converts the recording to 8 kHz, while the ‘-c’ option instructs SoX to write only a single channel.
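Since the command is just a string, it is easy to mistype a piece of it. Here is a rough sketch of a helper that assembles it from named arguments. build_sox_command() is a hypothetical function of mine; it only covers the three options used in this post (-e, -r and -c) and does no quoting, so it assumes paths without spaces.

```r
# Hypothetical helper: assemble the argument string handed to sox().
build_sox_command <- function(input, output, encoding = "float",
                              rate = NULL, channels = NULL) {
  # Options placed before the output file apply to the output file
  opts <- c("-e", encoding)
  if (!is.null(rate)) {
    # format() avoids scientific notation for large sampling rates
    opts <- c(opts, "-r", format(rate, scientific = FALSE))
  }
  if (!is.null(channels)) {
    opts <- c(opts, "-c", as.character(channels))
  }
  paste(c(input, opts, output), collapse = " ")
}

build_sox_command("recordings/a.wav", "mod_rec/a.wav", rate = 8000, channels = 1)
# "recordings/a.wav -e float -r 8000 -c 1 mod_rec/a.wav"
```

The returned string can be passed straight to sox() in place of the hand-built command.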
SoX can do a wealth of other things too. Here is a ‘cheat sheet’ https://gist.github.com/ideoforms/d64143e2bad16b18de6e97b91de494fd
Among the other things SoX can do are combining different audio files and modifying an audio file by applying gain, normalising, trimming and equalising. However, I’d argue that R can do those things natively and a little more transparently for my liking.
Conclusion
Should you come into possession of some audio files written in a variety of arcane formats (but highly useful to someone!), you need look no further than SoX. I consider it a godsend when you need to analyse a variety of audio files using statistical tools like R.
Hi there – this looks really useful, and I’d like to modify the code for my own use case (batch converting wav to flac). When I run the following script no flac file is created, and I’m just wondering if there’s anything I would need to modify from your code to do that?
Here’s the code I’m running:
library(seewave)
library(here)
# synthesis of a 1kHz sound
b<-synth(d=10,f=8000,cf=1000)
# save it as a .wav file in the default working directory
savewav(b,f=8000, file = here::here('recordings/example.wav'))
# list the filenames of the .wav files in the folder
filenames <- list.files(here('recordings'), pattern = '*.wav', full.names = FALSE)
#create a for loop to iterate through each .wav file
for (i in 1:length(filenames)) {
#the input file obtained the folder recordings
tmp1 <- paste0("recordings", "/", filenames[i])
#the output file put into the folder mod_rec
tmp2 <- paste0("mod_rec", "/", filenames[i])
# the actual command, which has ‘-e’ to invoke the change format function, ‘float’ to specify the format to which you want to convert the input file.
command <- paste(tmp1, "-e", "float", tmp2)
# sox will then send this command over to the terminal for execution.
sox(command)
}
Thanks!
Jay
Never mind, the above works as long as I set the path to SoX in the sox() function:
sox(command, path2exe = "C:/Program Files (x86)/sox-14-4-2")
…and change ‘tmp2’ to end in .flac.
Thanks!