Wednesday, February 12, 2014

Convert MP3 to a scrolling spectrum waterfall plot video

There are many utilities that display a scrolling spectrum waterfall plot [1] from a signal, but I was unable to find any open source utility that converted an audio file into a video file with a scrolling waterfall plot + the audio.

The SoX [2] sound utility can generate a static spectrum waterfall plot image from an audio file (or part of it), but it can't make a video. So I wrote a script to do this.

It's a very brute-force approach: it requires lots of CPU time and lots of temporary disk space. The script depends on SoX, mp3info, GNU Parallel and mencoder (or ffmpeg). GNU Parallel is optional, but gives a significant speed-up on a multi-core system.
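
To give a feel for what "brute force" means here, the following is a minimal sketch of the idea, not the actual script: render one SoX spectrogram PNG per video frame, each covering a short window of audio that starts slightly later than the previous one, then hand the numbered PNGs plus the original audio to ffmpeg. The function names, the frame-%06d.png naming and the DRY_RUN switch are my own assumptions for illustration. By default the functions only print the commands they would run.

```shell
#!/bin/sh
# Hypothetical sketch of the frame-per-image approach (not the real script).
# Set DRY_RUN=0 to actually run the commands (needs sox and ffmpeg).
DRY_RUN=${DRY_RUN:-1}

# Number of frames for a clip of $1 seconds at $2 frames/second.
frame_count() {
    LC_ALL=C awk -v d="$1" -v fps="$2" 'BEGIN { printf "%d\n", d * fps }'
}

# Start time in seconds of the audio window shown in frame $1
# at $2 frames/second.
frame_start() {
    LC_ALL=C awk -v n="$1" -v fps="$2" 'BEGIN { printf "%.3f\n", n / fps }'
}

# Print (or run) the SoX command that renders one frame.
# $1 = audio file, $2 = frame number, $3 = frames/second,
# $4 = width of the window in seconds.
render_frame() {
    start=$(frame_start "$2" "$3")
    out=$(printf 'frame-%06d.png' "$2")
    set -- sox "$1" -n trim "$start" "$4" spectrogram -o "$out"
    if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi
}

# Print (or run) the ffmpeg command that joins the numbered PNGs
# and the audio. $1 = audio file, $2 = frames/second, $3 = output file.
encode_video() {
    set -- ffmpeg -framerate "$2" -i frame-%06d.png -i "$1" \
        -c:v libx264 -pix_fmt yuv420p -c:a aac -shortest "$3"
    if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi
}

# Example: the first 3 frames of a 30 frames/second video
# with a 1 second wide window:
#   for n in 0 1 2; do render_frame mymusic.mp3 "$n" 30 1; done
#   encode_video mymusic.mp3 30 output.mp4
```

Every frame is rendered independently of the others, which is why the temporary disk usage is large and why spreading the per-frame commands across cores helps so much.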

To use it, create an empty directory on a volume with plenty of disk space and run the script with the audio file as a parameter. For example:

./make-spectrogram-video.sh -t "My Music File" mymusic.mp3

The output will be written to output.avi and output.mp4. Other options include setting the frame rate, the scrolling speed, audio credit text, etc. For full help:

./make-spectrogram-video.sh -h

The script is available on GitHub [2]. Here is a sample output video of Bach's Toccata and Fugue in D Minor [3]:



Updates:

17 Feb 2014: I noticed that all the interesting detail in music is squashed down at the very bottom of the spectrogram, so I updated the spectrogram module in SoX to add the option of plotting on a log frequency axis. This isn't in the official SoX distribution. See this blog post for more details [4].

Footnotes:

[1] http://en.wikipedia.org/wiki/Waterfall_plot

[2] https://github.com/jdesbonnet/audio-to-waterfall-plot-video

[3] Music MP3 file from https://archive.org/details/ToccataAndFugueInDMinor. YouTube video at http://www.youtube.com/watch?v=utp95bprqeg

[4] http://jdesbonnet.blogspot.ie/2014/02/sox-spectrogram-log-frequency-axis-and.html

3 comments:

Nande! said...

interesting, but you probably could feed the png directly to ffmpeg

Peter said...

Hi there.

Thanks for your script.
I got the following errors when trying it out: "[marmotte@arch ~]$ sh /usr/bin/make-waterfall-video.sh ~/Desktop/wheeze.mp3
/usr/bin/make-waterfall-video.sh: line 181: shift: -1: shift count out of range
TITLE=
CREDIT=https://github.com/jdesbonnet/audio-to-waterfall-plot-video
MP3_FILE=/home/marmotte/Desktop/wheeze.mp3
SPECTROGRAM_WIDTH=1 seconds
FPS=30 frames/second
OUTPUT_FILE=output.mp4
/usr/bin/make-waterfall-video.sh: line 198: mp3info: command not found
/usr/bin/make-waterfall-video.sh: line 207: sox: command not found
(standard_in) 1: syntax error
Title (top):
Credit (bottom/left): https://github.com/jdesbonnet/audio-to-waterfall-plot-video
Number of frames to generate:
/usr/bin/make-waterfall-video.sh: line 215: *185/1000: syntax error: operand expected (error token is "*185/1000")
"

Any idea?

Thanks again

Joe Desbonnet said...

Looks like mp3info and sox are missing (I should have listed those as dependencies). On Debian-style distributions (like Ubuntu) try this:

sudo apt-get install sox mp3info

... and try again. If it doesn't work can you contact me directly... jdesbonnet at gmail dot com.
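
For what it's worth, a start-up check along the following lines would turn a run like the one above into a single clear message instead of a string of syntax errors further down. This is only a sketch: missing_tools and check_dependencies are hypothetical helper names, not functions in the script, and the tool list in the example comment is taken from the dependencies mentioned in this post.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every listed command
# that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Hypothetical helper: return non-zero and report on stderr
# if any of the listed commands is missing.
check_dependencies() {
    missing=$(missing_tools "$@")
    if [ -n "$missing" ]; then
        # $missing is left unquoted on purpose so the names print on one line.
        echo "Missing required tools:" $missing >&2
        return 1
    fi
}

# Example, near the top of a script:
#   check_dependencies sox mp3info ffmpeg || exit 1
```

Stopping before any arithmetic is attempted avoids errors like the "operand expected" one in the log, which comes from an empty value being used in a calculation after the earlier commands failed.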