Being able to capture sound, store it and play it over and over again never fails to leave me in awe of its pioneers, from Thomas Edison to Alan Blumlein, the British electrical engineer who, in 1931, invented ‘binaural recording’ – what we now call ‘stereo’. (Never heard of him? Blumlein amassed 128 patents in audio, radar and television that are still in use today, but tragically, he was killed in a plane crash during World War II while testing airborne radar. His loss was considered so great that news of his death was kept secret until after the war.)
So far in this series, we’ve turned an Arduino into a number of audio-related projects from a digital audio player to, most recently, an audio spectrum analyser. This month, we see just how far we can push the popular microcontroller as we begin from scratch turning it into a basic but working digital audio recorder.
How it works
No matter what they look like, all digital audio recording devices have to perform the same basic functions – they have to capture an analog audio signal at regular intervals while at the same time, saving the digital data to storage. That might be easy for a PC or smartphone, but we’ll need to introduce some advanced programming techniques plus tap into some hidden features to get this working on an Arduino Uno.
For the record, our Digital Audio Recorder will capture a single (mono) analog audio channel with a sample rate of 22.05kHz, 8-bit sample depth and store it as a Windows WAV file with up to 4GB filesize on a microSD flash card. Now before you yawn in excitement at those specs, remember, we’re doing this with a 16MHz processor, just 2KB of RAM and 32KB of programming space. If only a Windows PC could be so efficient!
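For the curious, a WAV file of this kind is just a 44-byte header followed by the raw samples. The sketch below shows one way to fill in that header for mono, 8-bit audio. The function name `writeWavHeader` and its helpers are made up for illustration and are not taken from the project's source code.

```cpp
#include <stdint.h>
#include <string.h>

// WAV (RIFF) stores all multi-byte numbers least significant byte first.
static void put16(uint8_t* p, uint16_t v) {
    p[0] = static_cast<uint8_t>(v & 0xFF);
    p[1] = static_cast<uint8_t>(v >> 8);
}
static void put32(uint8_t* p, uint32_t v) {
    put16(p, static_cast<uint16_t>(v & 0xFFFF));
    put16(p + 2, static_cast<uint16_t>(v >> 16));
}

// Hypothetical helper: fill the 44-byte header of a mono, 8-bit PCM WAV file.
void writeWavHeader(uint8_t hdr[44], uint32_t sampleRate, uint32_t dataBytes) {
    const uint16_t channels = 1;
    const uint16_t bitsPerSample = 8;
    const uint16_t blockAlign = channels * bitsPerSample / 8;  // bytes per sample frame

    memcpy(hdr, "RIFF", 4);
    put32(hdr + 4, 36 + dataBytes);          // file size minus the first 8 bytes
    memcpy(hdr + 8, "WAVEfmt ", 8);
    put32(hdr + 16, 16);                     // size of the format chunk
    put16(hdr + 20, 1);                      // 1 = uncompressed PCM
    put16(hdr + 22, channels);
    put32(hdr + 24, sampleRate);
    put32(hdr + 28, sampleRate * blockAlign);  // bytes per second
    put16(hdr + 32, blockAlign);
    put16(hdr + 34, bitsPerSample);
    memcpy(hdr + 36, "data", 4);
    put32(hdr + 40, dataBytes);              // number of sample bytes that follow
}
```

The two size fields are 32-bit numbers, which is where the 4GB ceiling comes from. Since the final size isn't known until recording stops, a recorder would typically write a placeholder header first and patch the sizes in at the end.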
To help make the project (and source code) as easy to understand as possible, our recorder has just two buttons – record and stop. It doesn’t play audio and only records to a single fixed file in the root folder of the flash card called ‘REC00000.WAV’. An existing file with the same name will be overwritten. For playback, just take the flash card, load it into your PC, phone or tablet and play the file in any standard WAV file-ready media player or editor.
We all know digital audio – we all listen to music and we’ve all no doubt ripped a few CDs in our time. But how do we capture an analog signal and turn it into digital audio?
This is where the work of another electrical engineer, Harry Nyquist, helps us out. He figured out that in order to digitally capture an analog signal, we need to capture or ‘sample’ it at regular intervals, a rate which needs to be at least twice the highest audio frequency we want to capture. That means if we want a 5kHz audio bandwidth for example, we need a minimum 10kHz sample rate.
The way we capture those samples is with a circuit device called an analog-to-digital converter (ADC) and the Arduino Uno’s ATMEGA328P microcontroller chip has one on-board. But by default, it has a 9.6kHz sample rate and 10-bit sample depth, so we already have work to do to knock it into shape. For starters, the sample rate is too low (we’d only get a 4.8kHz audio bandwidth, which is AM radio quality at best) and the bit depth is the wrong size. (Bit depth is the sample precision, which is normally 16-bit in CD audio, but the Arduino’s ADC only has 10 bits to start with.)
Still, the ATMEGA328P has a few tricks up its sleeve. One of those we used in the Audio Spectrum Analyser project is an adjustable ADC clock prescaler. Just like any CPU, the ADC requires a clock signal to synchronise its function and here, this is set by a programmable divider or ‘prescaler’, dividing the 16MHz Arduino master clock by a default factor of 128 to create a 125kHz ADC clock rate.
Because the ADC uses the ‘successive approximation’ sampling method (we looked at this in detail a few months ago), each sample takes 13 clock cycles, giving us a sample rate of 125kHz/13 or approximately 9.6kHz. But if we reduce the prescale factor, we can increase the ADC clock rate – a prescaler factor of 16 immediately cranks up the sample rate to nearly 77kHz, or a sample every 13 microseconds.
But there is a downside – the ADC loses sample accuracy with increased clock speed. Even at this higher rate, however, the accuracy is still close to 8-bit, which is all we need. This ‘overclocking’ technique works extremely well, but there is one other major limitation – the limited number of prescaler settings leaves us with audio-unfriendly sample rates of roughly 9.6, 19.2, 38.5 and 76.9kHz, none of which are WAV format-standard.
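The prescaler arithmetic above is easy to check in code. This is a minimal sketch, not the project's own source: the function names are invented for illustration, and the part that touches the real `ADCSRA` register only compiles when building for the ATMEGA328P.

```cpp
#include <stdint.h>

// Free-running sample rate: master clock / prescaler / 13 ADC clocks per sample.
constexpr double adcSampleRateHz(double cpuHz, unsigned prescaler) {
    return cpuHz / prescaler / 13.0;
}

// Value for the three prescaler-select bits (ADPS2..0) in ADCSRA.
// Anything that isn't 2, 4, 8, 16, 32 or 64 falls back to the default of 128.
constexpr uint8_t adpsBits(unsigned prescaler) {
    return prescaler == 2  ? 1 : prescaler == 4  ? 2 : prescaler == 8  ? 3 :
           prescaler == 16 ? 4 : prescaler == 32 ? 5 : prescaler == 64 ? 6 : 7;
}

#if defined(__AVR_ATmega328P__)
#include <avr/io.h>
// Hypothetical helper: swap in a new prescaler, leaving the other ADCSRA bits alone.
void setAdcPrescaler(unsigned prescaler) {
    ADCSRA = (ADCSRA & ~0x07) | adpsBits(prescaler);
}
#endif
```

With a 16MHz clock, `adcSampleRateHz(16e6, 128)` works out to about 9.6kHz and `adcSampleRateHz(16e6, 16)` to about 76.9kHz, matching the figures above.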
Read through the ATMEGA328P datasheet and you’ll find that in addition to the ‘free-running’ sample mode we’ve been talking about, the ADC also has a ‘single conversion’ mode. Here, we kick off a conversion by setting a ‘start conversion’ bit or ‘flag’ in the ADC’s control register; the ADC grabs one sample and automatically clears the flag when that sample is ready for processing.
That mightn’t sound like cause for celebration, but when we combine it with another of the ATMEGA328P’s hidden talents called ‘timer interrupts’, we now have a mechanism for setting a much more precise sample rate.
In computer architecture, an ‘interrupt’ is a trigger to tell the processor to immediately divert from or ‘interrupt’ the current process and run a specific task associated with that interrupt. Once the new task is completed, the processor returns to the original process and picks up where it left off.
Now, the ATMEGA328P has all sorts of interrupt triggers to play with – you can trigger an interrupt externally by pulling an interrupt pin high or low as appropriate, but the chip also has a number of software-controlled options, one set in particular called ‘timer interrupts’.
In any CPU or microcontroller, a timer is just a hardware variable or ‘register’ that counts up to its maximum count (for example, 255 for an 8-bit timer), instantly drops back to zero and starts again. Because timers run off the master clock and each count takes a fixed number of clock cycles, we can programmatically figure out how long it will take to reach the top count, hence the ‘timer’ name.
The ATMEGA328P has three of them – one 16-bit and two 8-bit timers – along with different ways you can use them. One simple way is once the timer reaches its maximum count, it can set an ‘overflow’ flag, which can be used to trigger an interrupt. Like the ADC, timers also have a programmable prescaler for the input clock, so we can adjust how long it takes to reach that overflow condition.
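To put some numbers on that, the time a timer takes to overflow depends only on the master clock, the prescaler and how many counts the timer holds. The function below is a small illustrative sketch (the name is our own, not from the project code).

```cpp
#include <stdint.h>

// Microseconds for a timer to run from zero until it overflows.
// counts is 256 for an 8-bit timer and 65536 for a 16-bit one.
constexpr double overflowPeriodMicros(double cpuHz, unsigned prescaler, uint32_t counts) {
    return static_cast<double>(counts) * prescaler * 1e6 / cpuHz;
}
```

On a 16MHz Arduino, an 8-bit timer with no prescaling overflows every 16 microseconds, while a prescaler of 64 stretches that to 1,024 microseconds.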
But another far more useful option is a special mode called ‘Clear Timer on Compare Match’ or CTC. Instead of waiting for the timer to reach its overflow point, we can choose our own – for example, rather than wait for an 8-bit timer to count all the way to 255 and overflow, we can load any number between 1 and 255 into a special ‘compare’ register and once the timer reaches that number, it triggers an interrupt, the timer instantly reverts to zero and counts again. Using this technique to drive the ADC sampling, we can set the sample rate with far greater precision.
In our project, we use the chip’s ‘Timer2’ timer, switch it to this CTC mode and load an 8-bit register called ‘OCR2A’ with our ‘compare’ number to give us an interrupt every 45 microseconds. That gives us a sample rate of approximately 22.22kHz (1/45 microseconds) – not perfect, but closer than anything else.
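Here is a rough sketch of how that Timer2 setup might look. The article doesn't say which timer prescaler is used, so the value of 8 below is an assumption (it gives a 2MHz timer clock, so 45 microseconds is exactly 90 ticks). The function names are hypothetical, and the register code only compiles when building for the ATMEGA328P.

```cpp
#include <stdint.h>

constexpr uint32_t kCpuHz = 16000000UL;
constexpr uint32_t kTimerPrescaler = 8;    // assumption: not stated in the article
constexpr uint32_t kPeriodMicros = 45;

// Timer ticks that fit into one sample period.
constexpr uint32_t ticksPerInterrupt(uint32_t cpuHz, uint32_t prescaler, uint32_t periodMicros) {
    return (cpuHz / 1000000UL) * periodMicros / prescaler;
}

// The timer counts from zero, so N ticks means a compare value of N - 1.
constexpr uint8_t compareValue(uint32_t cpuHz, uint32_t prescaler, uint32_t periodMicros) {
    return static_cast<uint8_t>(ticksPerInterrupt(cpuHz, prescaler, periodMicros) - 1);
}

// Interrupt rate that a given compare value actually produces.
constexpr double sampleRateHz(uint32_t cpuHz, uint32_t prescaler, uint8_t compare) {
    return static_cast<double>(cpuHz) / (prescaler * (compare + 1.0));
}

#if defined(__AVR_ATmega328P__)
#include <avr/io.h>
#include <avr/interrupt.h>

void startSampleTimer() {
    cli();                          // no interrupts while we reconfigure
    TCCR2A = _BV(WGM21);            // CTC mode
    TCCR2B = _BV(CS21);             // timer clock = master clock / 8
    TCNT2  = 0;
    OCR2A  = compareValue(kCpuHz, kTimerPrescaler, kPeriodMicros);
    TIMSK2 = _BV(OCIE2A);           // interrupt on compare match
    sei();
}

ISR(TIMER2_COMPA_vect) {
    ADCSRA |= _BV(ADSC);            // kick off one 'single conversion'
}
#endif
```

With these assumed settings the compare value comes out at 89 and the interrupt fires about 22,222 times a second.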
Storing the samples
However, now we have these 10-bit samples turning up every 45 microseconds, we’ve got to do something with them. The first thing is to drop the two least significant bits (LSBs) and turn them into 8-bit samples – that’s relatively easy since each 10-bit sample is split and stored in two 8-bit registers, ADCL and ADCH. We set the ‘left adjust’ bit (ADLAR) in the ADC multiplexer register ADMUX so the top eight bits of each sample land in the ADCH register, which is then the only one we need to read.
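To see why reading ADCH alone is enough, the sketch below mimics what the left-adjust setting does to a 10-bit sample and compares it with simply shifting out the two LSBs. The struct and function names are illustrative only; the last few lines, which touch the real registers, compile only for the ATMEGA328P.

```cpp
#include <stdint.h>

struct AdcResult {
    uint8_t adch;  // high register
    uint8_t adcl;  // low register
};

// With left adjust on, the 10-bit sample sits at the top of the 16-bit ADCH:ADCL pair.
constexpr AdcResult leftAdjusted(uint16_t sample10) {
    return AdcResult{
        static_cast<uint8_t>((static_cast<uint32_t>(sample10) << 6) >> 8),
        static_cast<uint8_t>((static_cast<uint32_t>(sample10) << 6) & 0xFF)
    };
}

// Dropping the two least significant bits by hand gives the same 8-bit value.
constexpr uint8_t to8bit(uint16_t sample10) {
    return static_cast<uint8_t>(sample10 >> 2);
}

#if defined(__AVR_ATmega328P__)
#include <avr/io.h>
void enableLeftAdjust() { ADMUX |= _BV(ADLAR); }  // ADLAR lives in ADMUX
uint8_t readSample8()   { return ADCH; }          // top eight bits only
#endif
```

A full-scale sample of 1023 becomes 255 and a mid-scale 512 becomes 128. Conveniently, 8-bit WAV samples are unsigned too, so these values can go to the file as they are.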
But with only 2KB of RAM on-board (closer to 1KB by the time we run our code), the ATMEGA328P will still run out of space in a heartbeat – that’s where the microSD card module comes in.
We all know SecureDigital (SD) flash cards – they’re in everything from cameras to phones and tablets. We load them into our PCs and they appear as yet another storage drive. But when it comes to low-level hardware design like this, we need to understand a lot more about these tiny little storage devices.
All SD cards normally use a four-bit wide parallel interface to transfer data and achieve average write speeds beyond 10MB/second. But the key word is ‘average’ – the actual write speed can vary considerably, depending on the latency or delay in writing data to the card.
Normally, your PC or device has sufficient RAM to buffer the data and smooth out the writing process so there’s no perceived loss in write speed. But the ATMEGA328P has two things working against it – it only has 2KB of RAM, but more importantly, it doesn’t have a 4-bit data interface. Instead, we have to use the one-bit Serial Peripheral Interface (SPI) bus. It’s the fastest interface the Arduino Uno has, but even running at 8MHz, it leaves us with an average write speed of only 150KB/second.
Since we only need to store about 22KB per second (one byte per sample), it sounds like we’re fine, but again, this SPI-bus write speed is an average. SD card storage is divided into blocks of 512 bytes each, so we have to write the data in 512-byte chunks – but that write process can’t interfere with the ADC sampling every 45 microseconds. The problem is, average SD card latency in SPI mode is between 700 and 900 microseconds.
The simplest write method we could try is to just open up a file on the card, start grabbing samples, count them up as we go and when we hit 512, dump that block to the card using the sample interrupt routine. But if we do that, we’ll still be writing the block when the next sample interrupt request arrives – and that’ll result in either new samples or the entire block being lost.
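A quick back-of-the-envelope check, using only the figures quoted above, shows how badly the simple approach falls short. The constant and function names here are just for illustration.

```cpp
#include <stdint.h>

constexpr uint32_t kSamplePeriodMicros = 45;   // one sample interrupt every 45us
constexpr uint32_t kBlockBytes = 512;          // SD card block size
constexpr uint32_t kWriteLatencyMicros = 900;  // upper end of the quoted average latency

// Sample interrupts that come due while one block write is still in progress.
constexpr uint32_t samplesDueDuringWrite(uint32_t latencyMicros, uint32_t periodMicros) {
    return latencyMicros / periodMicros;
}

// Time it takes the ADC to produce one full block of samples.
constexpr uint32_t blockFillMicros(uint32_t blockBytes, uint32_t periodMicros) {
    return blockBytes * periodMicros;
}
```

At 900 microseconds per write, around 20 sample interrupts would come due while the routine is still busy. On the other hand, a 512-byte block takes 512 × 45 = 23,040 microseconds to fill, so there is plenty of time to write it if the write can happen somewhere other than inside the interrupt.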
We solve this problem using a technique commonly found in digital audio design called ‘double buffering’ – we set up two 512-byte blocks of RAM called ‘buffers’, labelled ‘buf00’ and ‘buf01’ in our code. As we start recording samples, we store them in the first buffer. When that buffer is full, we switch over to the second buffer; meanwhile, the first buffer is now saved to the flash card. The 512-byte buffer size gives us roughly 23 milliseconds (512 samples at 45 microseconds each) to get that first buffer written to the flash card.
The 900-microsecond average latency for SD card writes via SPI is way too long to handle inside the 45-microsecond sample interrupt process, but with 23 milliseconds per buffer to play with, we can get the ATMEGA328P to multitask and write the block in between ADC samples.
Once the second buffer fills up, we switch back to loading up the first buffer again, which has now been saved, and meanwhile start saving the contents of that second buffer in the next 23 milliseconds. What we end up with is this constant swapping – we’re storing samples in one buffer while writing the other buffer to the flash card. But on the card itself, we end up with a seamless stream of audio data, all done with just 1KB of RAM. (In reality, the Arduino’s SD library grabs its own 512-byte buffer, but without our double-buffering, this project wouldn’t work.)
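The buffer swapping can be sketched in a few lines. Only the names `buf00` and `buf01` come from the project; the other variables and functions are hypothetical, and this is a simplified illustration rather than the project's actual code. On the Arduino, `storeSample()` would be called from the sample interrupt and the other two functions from the main loop.

```cpp
#include <stdint.h>

const uint16_t kBufSize = 512;

uint8_t buf00[kBufSize];
uint8_t buf01[kBufSize];

volatile uint8_t  fillingSecond = 0;  // 0: samples go into buf00, 1: into buf01
volatile uint16_t fillPos = 0;        // next free slot in the buffer being filled
volatile uint8_t  fullBuffer = 0;     // 0: nothing to write, 1: buf00 full, 2: buf01 full
volatile uint32_t overruns = 0;       // times a buffer filled before the other was written

// Interrupt side: store one sample, swap buffers when the current one is full.
void storeSample(uint8_t sample) {
    uint8_t* buf = fillingSecond ? buf01 : buf00;
    buf[fillPos] = sample;
    fillPos = fillPos + 1;
    if (fillPos == kBufSize) {
        if (fullBuffer != 0) overruns = overruns + 1;  // card write fell behind
        fullBuffer = fillingSecond ? 2 : 1;
        fillingSecond = fillingSecond ? 0 : 1;
        fillPos = 0;
    }
}

// Main-loop side: the buffer waiting to be written, or nullptr if there is none.
const uint8_t* pendingBuffer() {
    if (fullBuffer == 0) return nullptr;
    return (fullBuffer == 1) ? buf00 : buf01;
}

// Main-loop side: call once the pending buffer has been written to the card.
void markWritten() {
    fullBuffer = 0;
}
```

In the main loop, the idea would be to check `pendingBuffer()` on every pass, hand any non-null result to the SD library as a 512-byte write, then call `markWritten()`. The main loop only ever touches the single-byte `fullBuffer` flag, which keeps the sharing with the interrupt routine simple.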
Listen to history in the making
We’ve been recording sound since Thomas Edison invented the tinfoil phonograph in 1877, and early cylinder recordings can still be heard today. Alan Blumlein’s stereo test recordings from 1933 have also been preserved and restored by the British Library.