Make a simple Arduino MP3 audio player

Microcontrollers are changing the world, thanks to the ability to write your own code, flash it to a chip and have that chip control almost anything.

That ability lets anyone make almost anything, and it’s where Arduino shines bright. In this article, we’ll turn an Arduino into a simple MP3 player using an audio codec shield.

Get the code

Download the code for this project from our website.

Unzip it, go into the ‘Sparkfun-MP3’ folder and copy the ‘SdFat’ and ‘SFEMP3Shield’ subfolders into the /libraries subfolder of your Arduino IDE install.

Restart the IDE if it’s already running. Load the AB11_mp3player.ino code and flash it to your Arduino board. Then remove the power, install the VS1053 shield with MP3 tracks on a MicroSD card, plug in your headphones, plug in the USB power pack and you should be away.

Don’t have the Arduino IDE? Get the latest version from here.

VS1053 MP3 shield

Audio on an Arduino is difficult (the board was never designed for it), but it’s made much easier by the VS1053 MP3 shield, an extension board featuring VLSI’s VS1053 multi-codec audio chip.

Read the specs and this is one versatile chip – it plays Ogg Vorbis, MP3, AAC, WMA and WAV audio straight off the bat, and with a software patch, it’ll even play lossless FLAC audio as well.

The VS1053 shield board is even better – it adds MicroSD card storage (up to 32GB) and also takes advantage of the VS1053’s audio recording capabilities (16-bit WAV/PCM or Ogg Vorbis via patch) through a built-in microphone or your own via the 3.5mm mic input.

Delving into the specs further shows the chip can drive a standard 32-ohm headphone load.

Total harmonic distortion (THD) is a reasonable 0.05% at that load and signal-to-noise ratio (SNR) at full-scale is 94dB – not earth-shattering, but still very respectable.

You’ll find the ‘VS1053 MP3 Shield’ on eBay for as little as US$13.

How it works

Look at the block diagram and the heart of the VS1053 is a DSP – digital signal processor – with its own RAM, control inputs, stereo analog-to-digital converter (ADC) for recording, stereo digital-to-analog converter (DAC) for playback and built-in headphone driver.

All we need to do is write code that gets our Arduino board to act as traffic controller, transferring audio data from the MicroSD card to the VS1053 chip, which plays it out through the headphone socket (the outer of the two).

WARNING – batteries, headphones only

However, there is one drawback with that headphone socket. It’s a bit technical, but the short version is that the chip’s output has a DC offset, which means you can’t plug it straight into an audio amplifier. If you do, the VS1053 chip will likely blow up.

The solution is to use a number of DC-blocking capacitors. The addendum PDF shows you the circuit required.

Otherwise, we recommend you stick to headphones only and power it with a USB battery power pack (for safety, don’t use a USB wall charger).

Power draw is about 80mA, so you should get a genuine full day’s playback from a 2200mAh USB power pack.
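
As a rough check on that figure, here is a minimal sketch of the arithmetic in plain C++. The function name is our own, and the result is an ideal upper bound: it assumes the pack delivers its full rated capacity, which real power packs don't quite manage once conversion losses are counted.

```cpp
// Ideal runtime estimate: rated capacity (mAh) divided by average draw (mA).
// Assumption: no conversion losses, so treat the result as an upper bound.
double estimateRuntimeHours(double capacityMilliampHours, double drawMilliamps) {
    if (drawMilliamps <= 0.0) {
        return 0.0;  // guard against divide-by-zero or nonsense input
    }
    return capacityMilliampHours / drawMilliamps;
}
```

Calling estimateRuntimeHours(2200.0, 80.0) gives 27.5 hours, which leaves some margin for losses while still covering a full day.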

VS1053 Arduino library

That’s the hardware side of things; now for the software.

For that, two special Arduino code libraries – SdFat and SFEMP3Shield – do much of the work for us.

The SdFat library takes care of initialising the link between the Arduino microcontroller chip and the MicroSD card and setting its speed (4MHz).

The library only supports FAT filesystems, so make sure you format the MicroSD card to FAT32 before using it here.

The SFEMP3Shield library handles most of the goodies inside the VS1053 chip. By the look of it, it doesn’t do everything (recording, for instance, seems to be missing), but there are still plenty of toys to play with, including support for VU meters, graphic equalisation, even changing the speed of playback.

Much like the Arduino’s ATMEGA328P microcontroller chip, the VS1053 is programmed using a series of registers, each housing a number of controls or ‘bits’ that can either be read from or written to, controlling the various functions inside.

The SFEMP3Shield library provides us with the shortcuts to coding those individual register bits, allowing us to use more recognisable ‘plain-English’ statements in our Arduino code.
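
To give a feel for what those shortcuts hide, here is a small generic sketch of setting, clearing and testing a single bit in a 16-bit register value. It is plain C++ rather than library code, and the bit position used is hypothetical, chosen for illustration only and not taken from the VS1053 datasheet.

```cpp
#include <cstdint>

// Hypothetical bit position, for illustration only.
constexpr std::uint8_t EXAMPLE_CONTROL_BIT = 3;

// Return a copy of the register value with the given bit switched on.
constexpr std::uint16_t setBit(std::uint16_t reg, std::uint8_t bit) {
    return static_cast<std::uint16_t>(reg | (1u << bit));
}

// Return a copy of the register value with the given bit switched off.
constexpr std::uint16_t clearBit(std::uint16_t reg, std::uint8_t bit) {
    return static_cast<std::uint16_t>(reg & ~(1u << bit));
}

// Report whether the given bit is currently on.
constexpr bool isBitSet(std::uint16_t reg, std::uint8_t bit) {
    return ((reg >> bit) & 1u) != 0;
}
```

A library wraps operations like these behind named functions, so the sketch code never has to deal with masks and shifts directly.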

How our MP3 player works

To get things rolling, we’ve used both libraries to make a very simple MP3 player with just a few lines of code.

It’s so simple, it has no buttons – it automatically begins playing MP3 tracks it finds on the MicroSD card out through the stereo headphone socket as soon as you apply power. Those tracks can be up to 48kHz sample rate, 16-bit depth, mono or stereo and up to 320kbps bit rate.

The one limitation needed to make this player as simple as possible, however, is that MP3 track names must be ‘track000.mp3’, ‘track001.mp3’ and so on – up to 255. (We’re using a built-in library function called ‘playTrack’, which takes a single 8-bit integer to count/select the tracks.)
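
The naming scheme can be sketched as follows. This is our own illustration in plain C++ of how an 8-bit track number maps to a filename of the form described above; it is not the library's internal code, and the function name is hypothetical.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Build the filename for a given 8-bit track number,
// zero-padded to three digits (0 -> "track000.mp3").
std::string trackFileName(std::uint8_t trackNumber) {
    char name[13];  // "track" + 3 digits + ".mp3" + terminating NUL = 13 bytes
    std::snprintf(name, sizeof(name), "track%03u.mp3",
                  static_cast<unsigned>(trackNumber));
    return std::string(name);
}
```

Because the track number is a single byte, the highest name this can produce is ‘track255.mp3’, which is where the 255 limit comes from.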

As soon as power is applied, the player begins with the ‘isPlaying’ flag set to zero – this triggers code to increment the 8-bit counter from -1 to 0 and use the counter to fire the MP3Play.playTrack() command, looking for ‘track000.mp3’ (count 0) on the MicroSD card.

If it finds the file, it begins playing and the ‘isPlaying’ flag is set to 1. When the track is finished, ‘isPlaying’ goes back to zero, which increments the count again so the next track is played (count 1, ‘track001.mp3’).

Once the last track stops playing, the counter increments again and playTrack() goes looking for the next track. But with no tracks remaining, the playTrack() command, which has been returning a value of ‘0’ to the ‘returnVal’ variable each time, now returns an error-indicating non-zero value.

This triggers our code to reset the counter to zero and tell the playTrack() command to play the corresponding file (‘track000.mp3’), creating a very simple ‘repeat-all’ loop.
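
The repeat-all logic described above can be sketched on a desktop compiler without any hardware. In this simulation, FakePlayer is a hypothetical stand-in for the shield library: its playTrack() simply returns 0 if the track number exists and a non-zero value otherwise. We also assume the counter's starting value of "-1" is held as 255 in an unsigned 8-bit variable, so the first increment wraps to 0. Each loop pass represents one moment where ‘isPlaying’ has dropped back to zero.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the MP3 shield library (not the real API).
struct FakePlayer {
    std::uint8_t tracksOnCard;  // tracks 0 .. tracksOnCard-1 exist

    // Returns 0 on success, non-zero if the track file is missing.
    std::uint8_t playTrack(std::uint8_t n) const {
        return n < tracksOnCard ? 0 : 2;
    }
};

// Simulate the player each time a track finishes ('isPlaying' is zero)
// and return the list of track numbers that were started.
std::vector<std::uint8_t> simulateRepeatAll(const FakePlayer& player, int passes) {
    std::vector<std::uint8_t> started;
    std::uint8_t counter = 255;  // assumption: "-1" stored in 8 bits; wraps to 0

    for (int i = 0; i < passes; ++i) {
        ++counter;  // move on to the next track
        std::uint8_t returnVal = player.playTrack(counter);
        if (returnVal != 0) {
            // No such track: go back to the start of the card.
            counter = 0;
            returnVal = player.playTrack(counter);
        }
        if (returnVal == 0) {
            started.push_back(counter);
        }
    }
    return started;
}
```

With three tracks on the card and five passes, the simulation starts tracks 0, 1, 2, 0, 1 in that order. With an empty card, nothing is ever started.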

Setting audio output level

We’ve noted it in the source code, but the statement for setting audio output levels works back-to-front from what you might expect – the library code we’re using expects to see an 8-bit integer with ‘0’ for maximum output and ‘255’ for silence.

Each increment apparently represents a 0.5dB fall in output level, but we’re not convinced – still, we’ve set it to a nominal ‘40’. Be warned though – setting it to ‘0’ will drive your low-ohm headphones hard and loud!
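
If the 0.5dB-per-step figure is taken at face value (an assumption, as noted above), the mapping between the volume setting and attenuation can be sketched like this. The function names are our own, written in plain C++ for illustration.

```cpp
#include <cstdint>

// Convert the library's inverted volume setting (0 = loudest, 255 = quietest)
// to an output level in dB relative to maximum, assuming 0.5 dB per step.
double volumeToDecibels(std::uint8_t volumeSetting) {
    return -0.5 * static_cast<double>(volumeSetting);
}

// Convert a desired attenuation in dB (positive number = quieter)
// to the nearest volume setting, clamped to the 8-bit range.
std::uint8_t decibelsToVolume(double attenuationDb) {
    if (attenuationDb <= 0.0) return 0;      // no attenuation: maximum output
    if (attenuationDb >= 127.5) return 255;  // beyond range: quietest setting
    return static_cast<std::uint8_t>(attenuationDb * 2.0 + 0.5);  // round to nearest step
}
```

Under that assumption, our nominal setting of 40 works out to 20dB below maximum output.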