Tagging audio files

There are a few ways I can think of for an audio player to show you all relevant information (artist, title, release year, …) about an audio file.

  • derive it on-the-fly from the filename.
    • disadvantages:
      • only a limited number of fields can be entered before file names become too long
      • changing a field requires you to rename the file, after which programs can no longer find it.
      • you need to take care when naming your files; changing your filename format afterwards may be difficult.
    • advantages:
      • ummmm…not sure…
  • derive it from metadata added to the file.
    • disadvantages:
      • a lot of fields are standardized, but some are not. (album artist anyone?)
      • file becomes (only slightly) bigger
    • advantages:
      • very flexible: you can add fields whenever you want
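
To make the first approach a bit more concrete, here is a minimal sketch of deriving fields from a file name. The naming scheme ("track - artist - title") and the function name are my own assumptions for illustration, not something any particular player uses.

```python
import re
from pathlib import PurePath

# Hypothetical naming scheme: "<track> - <artist> - <title>.<ext>"
FILENAME_PATTERN = re.compile(r"^(?P<track>\d+) - (?P<artist>.+?) - (?P<title>.+)$")


def tags_from_filename(path):
    """Return the fields guessed from a file name, or {} if it does not match."""
    stem = PurePath(path).stem  # file name without directory and extension
    match = FILENAME_PATTERN.match(stem)
    if match is None:
        return {}
    fields = match.groupdict()
    fields["track"] = int(fields["track"])
    return fields


print(tags_from_filename("music/03 - Queen - Bohemian Rhapsody.mp3"))
print(tags_from_filename("Queen_Bohemian Rhapsody.mp3"))  # other scheme: nothing found
```

Any file that does not follow the scheme exactly yields nothing, and an artist name that itself contains " - " is split in the wrong place, which is the fragility listed in the disadvantages above.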

It seems obvious that the second way is the better one. One step further is a library-based music player, which reads the tags of all your audio files and stores them in some kind of local database. Combined with some form of smart playlist capability, this allows you to say, for example: generate me a playlist which holds 2 hours of music that I haven’t listened to for at least 2 months.

How is the tagging itself done? You (well, not you, a codec developer) take raw audio compressed with a certain codec (mp3, aac, whatever) and put it into a container file. You then add the tags to the container file. In this simple case, the container file holds 2 parts: the audio data and the tags holding information about it. For video files you can have (for example) an audio stream, a video stream, a subtitle stream and a tag stream. This of course requires some kind of standardization of the tagging format…

Now for the problem: this standardization is mostly there, but not completely. Let’s take one of the more complicated examples: mp3. (Strictly speaking an mp3 file has no real container; the tag is simply stored in front of or behind the audio data.) MP3 tags are standardized using the id3 system. It comes in several flavours, each adding features:

  • id3v1: supported by everything, but tag fields are severely limited in size (30 bytes each for title, artist, album and comment). Can also only hold genres from a pre-defined list.
  • id3v2: the field size limit is greatly expanded.
    • id3v2.1: old
    • id3v2.2: old
    • id3v2.3: the most widely supported version; text can be stored as unicode (UTF-16)
    • id3v2.4: fixes some design issues in the id3v2.3 tagging format but is regrettably not supported by most applications, mostly due to inertia. (v2.3 was ‘good enough’)
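
To show how tight id3v1 really is, here is a minimal sketch that reads such a tag. As far as I know the layout is a fixed 128-byte block at the very end of the file: the marker "TAG", then title, artist and album (30 bytes each), a 4-byte year, a 30-byte comment and a single byte holding an index into the pre-defined genre list. The function names are mine, and decoding the text as Latin-1 is an assumption, since the format does not record an encoding.

```python
import struct

ID3V1_SIZE = 128
# "TAG" marker, title, artist, album, year, comment, genre index
ID3V1_LAYOUT = struct.Struct("<3s30s30s30s4s30sB")


def parse_id3v1(trailer):
    """Parse the last 128 bytes of a file; return None if they hold no id3v1 tag."""
    if len(trailer) != ID3V1_SIZE or not trailer.startswith(b"TAG"):
        return None
    _, title, artist, album, year, comment, genre = ID3V1_LAYOUT.unpack(trailer)

    def text(raw):
        # Fields are padded with NUL bytes (some writers pad with spaces instead).
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre_index": genre,  # index into the fixed genre list, not free text
    }


def read_id3v1(path):
    """Read the id3v1 tag from the end of the file at `path`, if there is one."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)  # jump to the end to learn the file size
        if handle.tell() < ID3V1_SIZE:
            return None
        handle.seek(-ID3V1_SIZE, 2)
        return parse_id3v1(handle.read(ID3V1_SIZE))
```

Because every field has a fixed width, a longer title is simply cut off at 30 bytes, and a genre that is not in the list cannot be stored at all.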

Now, the way id3 works (and remember that id3 is not used for all formats, mp4 for example uses something completely different) is this: several pre-set fields are defined in the specification (artist, title, album and so on; they’re named differently in the spec, but I’m trying to keep this readable). You cannot just add custom fields as with some other tagging schemes. You can, however, achieve the same result by using comment fields with a description (id3v2 also has user-defined text fields for this purpose).

Of course, when using a tag editor (or, in fact, most audio players which support tag editing) you do not have to worry about all this: you just fill out the fields using the interface provided by the programmer. The artist tag, for example, is well-implemented as a standard in most applications. If you set this tag, most if not all players will show you the artist name. Title, album, genre and track number are also well-standardized.

There are also less standardized fields… Sometimes this is because no field is available in the id3 specification, sometimes because multiple fields could hold the information you want to store, and sometimes because the programmer of a tag editor or audio player does not agree with the specification.
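
For the curious, this is roughly what those pre-set fields look like on disk. The sketch below walks an id3v2.3 or id3v2.4 tag and lists the frames it finds. It assumes the tag sits at the very start of the data, and it refuses tags that use an extended header or unsynchronisation. The function names and the small name table are my own; the four-letter frame ids (TPE1, TIT2, …) are the spec names that tag editors hide behind friendlier labels.

```python
# A few spec frame ids and the readable names used in this article.
FRAME_NAMES = {
    "TPE1": "artist",
    "TIT2": "title",
    "TALB": "album",
    "TCON": "genre",
    "TRCK": "track",
    "TPE2": "band",
    "COMM": "comment",
}


def synchsafe_to_int(raw):
    """Decode a 'synchsafe' integer: only the low 7 bits of each byte are used."""
    value = 0
    for byte in raw:
        value = (value << 7) | (byte & 0x7F)
    return value


def list_id3v2_frames(data):
    """Return (major_version, [(frame_id, readable_name, size), ...]) or None."""
    if len(data) < 10 or data[:3] != b"ID3":
        return None
    major, flags = data[3], data[5]
    if major not in (3, 4) or flags & 0xC0:
        # id3v2.2 uses 3-character ids; unsynchronisation and extended
        # headers are deliberately left out of this sketch.
        raise ValueError("tag layout not handled by this sketch")
    tag_end = min(10 + synchsafe_to_int(data[6:10]), len(data))
    frames = []
    pos = 10
    while pos + 10 <= tag_end:
        frame_id = data[pos:pos + 4]
        if frame_id[0] == 0:
            break  # reached the padding after the last frame
        raw_size = data[pos + 4:pos + 8]
        # v2.4 stores frame sizes synchsafe, v2.3 as a plain big-endian integer.
        size = synchsafe_to_int(raw_size) if major == 4 else int.from_bytes(raw_size, "big")
        name = frame_id.decode("ascii", errors="replace")
        frames.append((name, FRAME_NAMES.get(name, "?"), size))
        pos += 10 + size  # 10-byte frame header plus the frame body
    return major, frames
```

Calling `list_id3v2_frames(open(path, "rb").read())` on an mp3 file gives a quick overview of which fields a tagger actually wrote, which helps when two programs disagree about where a value lives.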

An excellent example would be the concept of an album artist… The concept is good: for a compilation cd holding multiple artists, you would put each individual artist’s name in the artist tag of the relevant file. When sorting your library by artist, this would make a mess, so you set the album artist to ‘various’. Or, you have a cd which is mostly by the same band but also holds some solo tracks by individual band members. For these tracks you could put the band member as the artist but put the band name as the album artist. (Take e.g. Queen: a lot of solo work was done by Freddie Mercury, Brian May, …)

So what’s the problem with the album artist concept? Some programs read this information from the BAND tag, some read it from the album artist comment field, some from the albumartist field, the list goes on…
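
A player that wants to cope with this mess could simply try several fields in turn. Below is a minimal sketch of that idea. The dictionary keys and their order are assumptions of mine (loosely following the readable names mp3tag shows), not the behaviour of any specific player.

```python
# Candidate fields, in an assumed order of preference (hypothetical key names).
ALBUM_ARTIST_FIELDS = ("albumartist", "band", "comment album artist")


def album_artist(tags):
    """Pick the album artist from a dict of tag fields, falling back to the artist."""
    for field in ALBUM_ARTIST_FIELDS:
        value = tags.get(field, "").strip()
        if value:
            return value
    # No album artist stored anywhere: use the track artist instead.
    return tags.get("artist", "").strip()


print(album_artist({"artist": "Freddie Mercury", "band": "Queen"}))  # Queen
print(album_artist({"artist": "Queen"}))                             # Queen
```

The catch is that every program picks its own order, so a file tagged in one player can still sort differently in another.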

As an example, I have a table here mapping iTunes fields to id3 fields. I hope it’s useful and I hope I can add more players in the future. It does not hold the actual id3 field names but a human-readable format for them (as read from mp3tag). Take care, as this is “only” valid for mp3 files: for an mp4 file, for example, iTunes uses albumartist instead of band or album artist… (yeah, it’s a bit messy)

| iTunes field | id3 field | description |
|---|---|---|
| album | album | |
| artist | artist | |
| album artist | band | this is the example I gave above |
| bpm | bpm | |
| comment | comment | |
| soundcheck value | comment itunnorm | in mp4 this one is e.g. just ‘itunnorm’. You cannot just put a value in, you have to calculate it. |
| composer | composer | some people use this for album artists, though it’s meant to be used for classical music |
| grouping | contentgroup | |
| disc number/total discs | discnumber | x/y, e.g. 1/4, 2/4, 3/4 and 4/4 for a 4-album cd box |
| genre | genre | |
| compilation album | itunescompilation | put 1 for a compilation |
| track title | title | |
| track/total tracks | track | x/y, e.g. for a 5-song album: 1/5, 2/5, 3/5, 4/5 and 5/5 |
| year | year | |
| album art | standard id3 apic frame | for embedding cover art into your files |
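
The x/y notation used for track and disc numbers is easy to handle in code. Here is a small sketch (the function names are mine) that splits such a value and builds it again:

```python
def parse_position(value):
    """Split an 'x/y' value such as '3/5' into (3, 5); a bare '3' gives (3, None)."""
    number, _, total = value.strip().partition("/")
    return (int(number) if number.strip() else None,
            int(total) if total.strip() else None)


def format_position(number, total=None):
    """Build the 'x/y' form, or just 'x' when the total is unknown."""
    return f"{number}/{total}" if total is not None else str(number)


print(parse_position("3/5"))   # (3, 5)
print(format_position(1, 4))   # 1/4
```

Accepting a bare number as well matters in practice, since not every tagger writes the total.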
