Honey, I shrunk the MP3 decoder
A few months ago, I implemented a minuscule MPEG Audio Layer II decoder. While I still consider this a cool hack, it’s not of much use nowadays. Everyone uses MP3 or Vorbis; MPEG Audio Layer II is only used together with MPEG Video (think VCD, SVCD and DVB), but you usually don’t have MP2 audio files on your disk. So the aim was clear: I wanted to have an as-small-as-possible MP3 decoder, too.
This time, I didn’t even try to implement it myself. Layer II was already hard to understand, but Layer III is about 3 to 5 times more complex. Considering the bad quality of the standard, I decided against a fresh reimplementation. Instead, I looked at the large pool of existing open-source decoding software.
The most common MP3 decoder in the F/OSS world is mpg123 and the mpglib library based off it, so this was naturally my first choice. Digging deeper into the source, I found that it is too complex, hard to read and even harder to port. It is chock full of #ifdefs, uses assembler code and floating-point math all over the place and is quite big overall. So I checked the next alternative: FFmpeg. The MPEG Audio decoder of libavcodec is already implemented in a single file, so I used it as a starting point. I had to dig a bit through all the FFmpeg intricacies (to find out that the bitstream reader is actually a bunch of intertwined and largely redundant macros) to get it together, but finally it worked.
The harder part was making a really small Win32 executable. Creating a debug build in Visual C++ Express was quite easy, but the Release build without C runtime library dependencies took me quite some time to finish. The particular problem was a handful of floating-point functions like frexp() and pow() used during initialization. These operations don’t map directly to a single x87 FPU instruction, so the compiler insisted on calling a library function. I needed to write replacements for these two functions, but finding out how to implement them wasn’t trivial at all. Well, at least I learned a thing or two about floating-point assembler programming :)
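To give a rough idea of what such a replacement can look like, here is a minimal sketch of a frexp() stand-in in plain C. It is not the x87 assembler routine used in minimp3: the name tiny_frexp is made up for this example, and it assumes IEEE-754 64-bit doubles. pow() is not covered here.

```c
#include <stdint.h>

/* Hypothetical frexp() stand-in (not the minimp3 code).
 * Splits x into a mantissa m with 0.5 <= |m| < 1 and an exponent *e
 * so that x == m * 2^(*e). Assumes IEEE-754 binary64 doubles.
 * Zero, infinities and NaN are returned unchanged with *e = 0. */
static double tiny_frexp(double x, int *e)
{
    union { double d; uint64_t u; } v;   /* union punning avoids memcpy() */
    int biased;
    int adjust = 0;

    v.d = x;
    biased = (int)((v.u >> 52) & 0x7FF); /* 11-bit biased exponent */

    if (biased == 0x7FF || x == 0.0) {   /* inf, NaN or zero */
        *e = 0;
        return x;
    }
    if (biased == 0) {                   /* subnormal: scale into normal range */
        v.d = x * 18014398509481984.0;   /* multiply by 2^54 */
        adjust = -54;
        biased = (int)((v.u >> 52) & 0x7FF);
    }

    *e = biased - 1022 + adjust;

    /* Force the exponent field to 1022, i.e. a value in [0.5, 1). */
    v.u = (v.u & ~((uint64_t)0x7FF << 52)) | ((uint64_t)1022 << 52);
    return v.d;
}
```

Because the function only shuffles bits, it needs no library call at all, which is the property that matters for a build without the C runtime.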
All in all, the result is roughly 1680 lines of code (not counting empty lines, comments and 600 lines of tables) that compile to a 28k executable. The fabulous executable compressor kkrunchy manages to pack this down to a reasonable 13312 bytes. Not quite as small as the Layer II decoder, but OK. I guess that a carefully-designed full reimplementation would be possible in 8 or 10k, but I’ll leave it to someone else to do that.
The small size comes at a cost, though. The FFmpeg MP3 decoder is neither fast nor bug-free. It uses around 1-2% CPU on my Athlon64 3000+ (fully optimized, of course), while other decoders can’t be measured at all (i.e. below 1%). An even greater problem is a bug in the Huffman decoder that renders some (but not all) silent parts in some (but not all) files wrong. I first thought that it had to be a problem with my port, but it’s there in vanilla FFmpeg, too, so I’m writing a bug report and leaving it to Fabrice and the others to fix :)
[Update 2007-02-05: Two days ago, Michael Niedermayer (a lead FFmpeg developer) fixed the bug. Now minimp3 decodes all files correctly o/]
If you don’t mind the shortcomings of the current implementations, feel invited to try it or use it in your projects. My personal incentive was to get rid of bass.dll or fmod.dll when writing demos – but if anyone feels like implementing a full-featured graphical MP3 player in, say, 64k, I’d love to see that, too :)
Download
- minimp3.tar.gz (51k) — the source code and compiled example application

very cool, and drag and drop works on the exe :)
You would be better off using Vorbis. It has much better quality than MP3.
simply awesome! although I would expect on windows it would be much smaller just to call the appropriate directshow functions to play the mp3. nevertheless, very kool! i’m glad you did this, as i didn’t want to use your mp2 for the very reasons you outlined here. great work!
aboeing: You’re right, just having DirectShow play an MP3 file is much simpler and smaller than this decoder. It has some limitations, though:
1. Results vary depending on which MP3 decoder gets chosen. This would not be an issue if all decoders did their job well, but I remember that even the standard decoder that comes with Windows has some severe (audible!) bugs. I’m not going to say that minimp3 is bug-free (it’s certainly not), but it’s a fixed implementation you can test with and be sure that it sounds the same on any system.
2. As far as I know, DirectShow can’t play from memory chunks easily.
3. Finally, DirectShow isn’t exactly portable :)
Depending on what kind of application you are targeting, these points might or might not apply to you. I developed minimp3 as part of a demo engine, and for this type of application, all three points matter :)
Hey KeyJ,
Yes, I’ve had trouble with getting DirectShow to play from and TO memory chunks and it certainly isn’t portable. I’ll see if I can get your MP3 player going on an embedded device we do some research on and I’ll send you a link if it works :)
(I wasn’t aware DirectShows MP3 decoders were buggy, I’ll have to look in to that!)
Well, every decoder is buggy at some point, I think :) But I remember hearing a suspiciously high amount of “blubs” in MP3s decoded with Windows Media Player when compared to WinAMP a while back. The problem might have been solved by now, but I learned my lesson already :)
Great work on this tiny MP3 player. How much work would be required to create a managed class wrapper, so this could be used within C#?
Mark: Not much, I think … on the C side, it would mean that the decoder needs to be compiled into a DLL, and on the C# side, it would require some standard P/Invoke stuff, I guess. Frankly, I didn’t dig very deep into .NET yet, so I can’t say much more about this.
Anyway, if you’re using C#, having a small native executable most likely isn’t one of your goals. So I’d rather recommend one of the higher-level APIs like Windows Media / DirectShow, or a ready-made third-party library like Bass.Net.
KeyJ: Thanks for your response. I’ve already looked at DirectShow, but it’s rather difficult to use in C#. Bass.Net looks like a great option however. Another alternative I’ve found is a managed Ogg library (http://www.j-ogg.de/CsOgg.zip) that may suit my needs.
Your entry points for the library seem to have a bug.
The mp3_decoder_t is typedef’d as:
typedef void* mp3_decoder_t;
And rightly so, the main create function does this:
mp3_decoder_t mp3_create(void)
{
// Something along the lines of:
return calloc(1, sizeof(mp3_context_t));
}
Which is fine, since an mp3_context_t* is converted to a void*.
However, the _decode function does this:
int mp3_decode(mp3_decoder_t *dec, void *buf, int bytes, signed short *out, mp3_info_t *info)
Notice that the first parameter is an mp3_decoder_t*, which is effectively a void**. This parameter is then cast to an mp3_context_t* (which is wrong).
Furthermore, mp3_done has the same flaw.
This can be fixed by removing one level of pointer indirection from the first parameter of the mp3_done and mp3_decode functions.
Needless to say, the Windows example compiles and runs only because the calling code passes an mp3_decoder_t rather than an mp3_decoder_t* (passing the latter would most likely crash).
Also, in the mp3_done function, there is no need to check for the pointer before free’ing. As per the C standard, free(NULL) is a no-op.
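The fix described above can be sketched roughly as follows. This is only an illustration, not the actual minimp3 source: the mp3_context_t contents are a made-up placeholder, the mp3_decode parameter list is shortened (no mp3_info_t), and its body just pretends to consume the buffer.

```c
#include <stdlib.h>

/* Opaque handle, as quoted above. */
typedef void *mp3_decoder_t;

/* Placeholder for the real decoder state (hypothetical field). */
typedef struct {
    int frames_seen;
} mp3_context_t;

mp3_decoder_t mp3_create(void)
{
    /* An mp3_context_t* converts implicitly to void*. */
    return calloc(1, sizeof(mp3_context_t));
}

/* The handle is taken by value (void*), not as mp3_decoder_t* (void**). */
int mp3_decode(mp3_decoder_t dec, void *buf, int bytes, signed short *out)
{
    mp3_context_t *ctx = dec;   /* safe: this is the pointer mp3_create returned */

    (void)buf;                  /* unused in this sketch */
    (void)out;

    if (!ctx || bytes <= 0)
        return 0;

    ctx->frames_seen++;
    return bytes;               /* placeholder: pretend everything was consumed */
}

void mp3_done(mp3_decoder_t dec)
{
    free(dec);                  /* free(NULL) is a no-op, so no check is needed */
}
```

With these signatures, the declared parameter type matches what the calling code already passes, so no cast from void** to the context pointer is involved.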
Dude, you rule ! I was wandering the net for a simple and LIGHT mp3 decoder : libmad, mpglib, oh god … then i found it !
Thx a bunch ! ( i added loop and stream from memory, now to implement simple fft )