In a previous article, I described a method for detecting chords in an audio file (also available for Scala). Continuing on this theme, the following will find the onset of a drumbeat in a file, using R. I’m using a single drumstick click, which you can hear on freesound.org.
This method detects sudden increases in volume; it is not designed to respond to changes in pitch or timbre (for example, a song that marks the beat by changing pitch or switching instruments, respectively). However, the methods for handling those cases appear to build on the one described below.
We’re looking for the onset of the drumbeat, where the anticipation starts. From reading the literature, it appears that the onset, rather than, say, the loudest point, is believed to be what we perceive as the beat in music.
Load the file into memory:
library(sound)
file<-'432__tictacshutup__prac-perc-4.wav'
sample<-loadSample(file)
fourbeats<-appendSample(sample, sample, sample, sample)
saveSample(fourbeats, "out\\fourbeats.wav")
eightbeats<-appendSample(sample, sample, sample, sample, sample, sample, sample, sample)
saveSample(eightbeats, "out\\eightbeats.wav")
Next, define the first-order difference function. (R has a built-in function, diff, which is essentially the same.)
# Lagged first-order difference: x[i + lag] - x[i]
firstOrderDiff<-function(x, lag){
  x[(1+lag):length(x)] - x[1:(length(x)-lag)]
}
# Waveform data of the loaded sample
wav<-sample$sound
plot(1:length(wav), abs(wav), type="l")
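As a quick sanity check of the remark that diff is essentially the same, the sketch below compares a hand-written lagged difference with base R's `diff(x, lag = ...)` on a made-up vector. The name `lagged_diff` and the input values are illustrative assumptions, not part of the original code.

```r
# Hand-written lagged difference: x[i + lag] - x[i]
lagged_diff <- function(x, lag) {
  n <- length(x)
  x[(1 + lag):n] - x[1:(n - lag)]
}

x <- c(1, 4, 9, 16, 25, 36)    # made-up values for illustration

by_hand  <- lagged_diff(x, 2)  # subtract each value from the one two steps later
built_in <- diff(x, lag = 2)   # base R equivalent

print(by_hand)
print(built_in)
print(isTRUE(all.equal(by_hand, built_in)))
```

Note where the closing bracket of the first index sits: the subtraction has to happen between two slices of `x`, not inside a single index expression.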
Clearly there is a lot going on. For the sake of example, let's zoom in:
begin<-abs(wav[1:2000])
plot(1:length(begin), abs(begin), type="l")
We're really interested in the magnitude (loudness) of the sound, which is much easier to work with if you take the absolute value.
Still, there are a lot of peaks and valleys. The first-order difference approximates the derivative and can be computed over a range (e.g. sample 99 - sample 0, sample 100 - sample 1, etc.), but experimentally this seems unstable. Instead, I compute the rolling mean over a small window, then compute the derivative. This is part of the value of working only with positive numbers, as a rolling mean is useless on alternating negative and positive numbers.
library(zoo)
smoothed<-rollmean(begin, 100)
plot(abs(smoothed), type="l")
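To illustrate why the absolute value matters before smoothing, here is a minimal sketch on a synthetic sine wave rather than the drum sample. It assumes a base-R helper, `rolling_mean` (a hypothetical stand-in for `rollmean`, written with cumulative sums so that no extra package is needed), and made-up signal parameters.

```r
# Synthetic "audio": a sine wave oscillating around zero, period 20 samples
n <- 2000
t <- seq_len(n)
signal <- sin(2 * pi * t / 20)

# Rolling mean over k samples using cumulative sums (base R only).
# Element i of the result is mean(x[i:(i + k - 1)]).
rolling_mean <- function(x, k) {
  cs <- cumsum(c(0, x))
  (cs[(k + 1):length(cs)] - cs[1:(length(cs) - k)]) / k
}

raw_smoothed <- rolling_mean(signal, 100)       # positive and negative halves cancel
abs_smoothed <- rolling_mean(abs(signal), 100)  # follows the loudness of the signal

cat("largest |rolling mean| of the signed signal:", max(abs(raw_smoothed)), "\n")
cat("average rolling mean of the absolute signal:", mean(abs_smoothed), "\n")
```

The signed signal averages out to roughly zero however loud it is, while the rectified signal gives a smooth, positive curve whose slope can be examined.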
The key step is to find the maximum of the derivative, which marks where the sound rises fastest:
start<-which.max(firstOrderDiff(begin, 100))
abline(v=start)
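The whole pipeline can be tried without the audio file. The sketch below is an illustration under stated assumptions, not the exact code above: it builds a synthetic click with a known onset (`true_onset`, the noise level, and the decay rate are all made up), smooths its absolute value with a hypothetical base-R `rolling_mean` helper, and takes the position of the steepest rise.

```r
set.seed(1)
n <- 2000
window <- 100
true_onset <- 800                      # assumed onset, chosen for illustration

# Quiet background noise, then a decaying oscillation starting at true_onset
noise <- rnorm(n, sd = 0.01)
burst <- numeric(n)
idx <- true_onset:n
burst[idx] <- exp(-(idx - true_onset) / 300) * sin(2 * pi * idx / 15)
click <- noise + burst

# Work with the magnitude, as in the article
envelope <- abs(click)

# Rolling mean via cumulative sums; element i is mean(x[i:(i + k - 1)])
rolling_mean <- function(x, k) {
  cs <- cumsum(c(0, x))
  (cs[(k + 1):length(cs)] - cs[1:(length(cs) - k)]) / k
}

smoothed <- rolling_mean(envelope, window)
slope <- diff(smoothed)                # first-order difference of the smoothed curve
detected <- which.max(slope)           # index of the steepest rise

cat("true onset:", true_onset, "\n")
cat("steepest rise at smoothed index:", detected, "\n")
```

Because `rolling_mean` indexes each window by its left edge, the steepest rise is reported up to one window length before the true onset. When mapping the result back to sample positions, it is worth keeping that offset in mind.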
In the future I will describe how to generalize this for finding each beat, and handling more types of music.
If you're interested in R (the statistical programming language), check out my review of the R Cookbook. You may also be interested in The Scientist & Engineer's Guide to Digital Signal Processing.