
Spectrogram Videos of Audio Files

This is a short article on creating video spectrograms (time-frequency plots) of audio files. The work comes from a research project, Soundscapes of Socioecological Succession, funded by a Center for Environmental Futures, Andrew W. Mellon 2021 Summer Faculty Research Award.

The example in Video 1 is a spectrogram video created using Matlab. The audio is a recording of a small dynamite blast of a 70″ stump across from Eagle Rock, just past Eagle Rock Lodge on the McKenzie Hwy in Vida, OR.

Video 1. Video of spectrogram with playback barline and synchronized audio file.

I love spectrograms. I’ve worked with time-frequency plots in various ways in the past, namely spectral smoothing music (listen on Spotify), collaborative research (read the paper), and even teaching (Data Sonification course) at the University of Oregon. Yet I am still amazed by the work of spectrograms and sound in the sciences. I knew of the theories around animals occupying various frequency spaces within a habitat from the bioacoustics work of Garth Paine and the great multimedia reporting by Andreas von Bubnoff. Still, after an interview with a UO visiting researcher, Ines Moran, as part of our Soundscapes of Socioecological Succession project, I was further intrigued by how sound, spectrograms, and AI play an integral role in her bioacoustics research on bird communication.

This led me to revisit my work with spectrograms. I was blown away by Merlin ID’s auto spectrogram video app, and I wanted to take a fresh look at how I create my own spectrogram videos. I’ve been frustrated with multiple software solutions for generating scrolling spectrogram videos. Not having a seamless solution other than screen capture of iZotope RX or Audacity spectrograms, I did some more research, looking at the iAnalyse 5 software (which replaces eAnalysis) and Cornell Lab’s RavenLite software, but I was unsatisfied with the movie export results. I appreciated the zoom functionality of each program but wanted auto-chunking or scrolling of the spectrogram within a high-resolution video.

I didn’t easily discover a straightforward plug n’ play solution (although I’m open to hearing one if you have a suggestion!). I ended up going back to Matlab to see if I could find a pre-existing library or code I could implement. I found a few slightly different versions, none of them exactly seamless. I ended up refashioning some pre-existing code written by Theodoros Giannakopoulos that generated gifs from spectrograms. See the gif in Figure 1.

Figure 1. Original Gif export using pre-existing Matlab code.

I used this code as a starting point to build out a function that exports videos of spectrograms and lets me specify the length in seconds of each window. Video 2 depicts an example display output with the audio waveform and the spectrogram of a Swainson’s Thrush bird call. I synced the audio afterward in Adobe Premiere. I then removed the waveform to focus on the spectrogram, and I had to get fancy with the x-axis labels so they dynamically match windows of any length in seconds.
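For readers who want to see the windowing idea in code, here is a minimal sketch in Python. It is not the Matlab code from this post. The function names, the six-tick layout, and the 25-second example are assumptions for illustration only. It splits a recording into fixed-length pages and builds x-axis labels that follow whatever window length is chosen.

```python
import math


def page_windows(duration_sec, window_sec):
    """Split a recording into consecutive pages of window_sec seconds.

    Returns a list of (start, end) times; the last page may be shorter.
    Hypothetical helper, not taken from the original Matlab code.
    """
    if duration_sec <= 0 or window_sec <= 0:
        raise ValueError("duration_sec and window_sec must be positive")
    n_pages = math.ceil(duration_sec / window_sec)
    return [(i * window_sec, min((i + 1) * window_sec, duration_sec))
            for i in range(n_pages)]


def tick_labels(start_sec, window_sec, n_ticks=6):
    """Evenly spaced x-axis labels (in seconds) covering one full page."""
    step = window_sec / (n_ticks - 1)
    return [f"{start_sec + k * step:.1f}" for k in range(n_ticks)]


# Example: a 25-second recording shown in 10-second pages.
pages = page_windows(duration_sec=25.0, window_sec=10.0)
# pages -> [(0.0, 10.0), (10.0, 20.0), (20.0, 25.0)]

# Labels for the second page start at 10 s rather than 0 s.
labels = tick_labels(pages[1][0], window_sec=10.0)
# labels -> ['10.0', '12.0', '14.0', '16.0', '18.0', '20.0']
```

Keeping the label spacing tied to the window length, rather than hard-coding tick values, is what lets the same function serve pages of any duration.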

Video 2. Video output on a single screen, with split waveform and spectrogram view.

While I was unable to get a scrolling spectrogram video out of a single piece of software, the auto-chunking feature was quite time-saving. I simply crafted an Adobe Premiere template with a scrolling animation graphic that I can easily edit to equal the exact window length, and I sync my original audio file to the movie, all within about a minute or two (see Figure 2). The final version has a nice scrolling playback bar on pages of spectrogram videos.
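The barline animation itself is keyframed by hand in Premiere, but the underlying arithmetic is a linear sweep that resets on each page. The Python sketch below shows that arithmetic only. The function name and the pixel values for the plot edges are hypothetical, not measurements from the actual template.

```python
def barline_position(t_sec, window_sec, plot_left_px, plot_right_px):
    """Return (page_index, x_px) for a playback bar at time t_sec.

    The bar moves linearly from the left to the right edge of the plot
    over one window, then starts again on the next page.
    """
    page = int(t_sec // window_sec)
    # Fraction of the current page that has already played (0.0 to 1.0).
    frac = (t_sec - page * window_sec) / window_sec
    x = plot_left_px + frac * (plot_right_px - plot_left_px)
    return page, x


# Example: 12.5 s into a video with 10-second pages, with an assumed
# plot area spanning x = 100 px to x = 1820 px in a 1920-wide frame.
page, x = barline_position(12.5, 10.0, 100, 1820)
# page -> 1 (the second page), x -> 530.0
```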

Figure 2. Screenshot of Adobe Premiere with line graphic that keyframe animates across the spectrogram during playback.

Video 3 displays the spectrogram complete with audio waveform, audio file, and playback barline (audio and playback barline added in Adobe Premiere).

Video 3. Video example with scrolling playback barline

Video 4 shows the final version of the code output after removing the audio waveform, resizing the graph, and updating the title. Again, adding the playback barline and synchronizing the audio were done in Adobe Premiere.

Video 4. Final version of Matlab code that generates a 1920×1080 spectrogram video the same length as the audio file.

The code gave me an easy way to label the spectrogram and embed this in the video. There are four steps.

1. Run the script in Matlab, which outputs a 1920×1080 video the same length as the audio file.

2. Drag the video into Adobe Premiere with the Graphics playback bar template.

3. Drag the audio to the start of the timeline so it matches the animation.

4. Export the 1920×1080 video.

The process for one audio file takes about 2-3 minutes from start to finish.

I could make this more dynamic by grabbing the audio file length automatically and setting the frame rate automatically to match, and then simply determine how many “screens/pages” I want by editing the function variables.
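As one possible reading of that idea, the Python sketch below reads a WAV file's duration with the standard-library `wave` module and derives the window length and total frame count from a chosen number of pages. It is not the Matlab code from the repository. The function names, the 30 fps default, and the silent demo file are assumptions made so the example runs on its own.

```python
import math
import os
import tempfile
import wave


def wav_duration_sec(path):
    """Duration of a WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def plan_video(duration_sec, n_pages, fps=30):
    """Return (window_sec, total_frames) for a video matching the audio.

    window_sec is the length of each spectrogram page; total_frames is
    rounded up so the video is never shorter than the audio.
    """
    window_sec = duration_sec / n_pages
    total_frames = math.ceil(duration_sec * fps)
    return window_sec, total_frames


# Demo only: write 2 seconds of mono silence so no external file is needed.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)        # 16-bit samples
    wav.setframerate(8000)
    wav.writeframes(b"\x00\x00" * 16000)

duration = wav_duration_sec(demo_path)
window_sec, total_frames = plan_video(duration, n_pages=4, fps=30)
# duration -> 2.0, window_sec -> 0.5, total_frames -> 60
```

With this arrangement the only value left to edit by hand is the number of pages; everything else follows from the audio file.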

For those interested in the Matlab code, I have made it publicly accessible as a repository on GitHub.

References / Resources

The inspiration for this work came from an interview with Ines G. Moran, a visiting scholar at the University of Oregon who works in wildlife bioacoustics (website).

Max/MSP/Jitter Abstractions

Below are just some of the documented resources I’ve made for Max/MSP. Hopefully these Max abstractions help save you time within this wonderful programming environment.

Jitter matrix grid – creates a grid of any size columns and rows (stored inside a coll) for controlling a Jitter matrix.

Tempo Control – interface for controlling tempo in Max. User can control tempo with bpm or millisecond. Includes a tap tempo.

drop folder – dropping a folder of files will automatically place files into a umenu object. Great for buffer/groove objects.

for loop – performs an arithmetic ‘for loop’ in Max/MSP. Sometimes line programming can be much easier than graphical.

data as table – displays data in a table as it is received, so you can graphically see the values of incoming data over the course of time.

MIDI Drum umenu – standard channel 10 MIDI drums list saved conveniently into a umenu (culled from Apple’s basic MIDI synthesizer)

MIDI Control – modular design for controlling MIDI volume, program changes, and makenote in Max. Each function is inside its own patcher object.

toggle message – toggles input of any message (number, message, bang) between two outputs.

modulo bang – user controls up/down integer count with modulo control when only bangs are available.