[ID3 Dev] New genre coding idea?

Mitchell S. Honnert mitch at honnert.com
Sat Mar 4 15:03:13 PST 2006


>At one point it was determined that the functionality of ID3v1 was
insufficient to meet people's needs.

But more importantly, at a later point - the creation of the ID3v2 standard,
to be precise - it was determined that a fixed set of genre values was *not*
sufficient to meet people's needs.  In other words, at some point, enough of
the people who were determining the ID3v2 standard agreed that there was just
no way to come up with a finite list of values for something so subjective as
"Genre", so the best approach was to just let the field be free-form.

 

>It's time to address the deficiencies of the current genre model in some
way.  While the need for exotic genre naming schemes might be outside
the realm of an orderly standardized word set, we should also question the
usefulness of retaining functionality which I suspect is seeing limited use.


What are these deficiencies to which you refer?  I would agree that the
vagueness of the standard around how to implement multiple genres has
limited this functionality, but how is a free-form genre value - single or
multiple - a deficiency?  I can understand the appeal of having a uniform
set of genres, but it would be highly impractical and outside the scope of
this list to be responsible for perpetually maintaining an ever-changing
list of genre values for every music category in the world.  But this isn't
a deficiency in the standard; it's just the nature of the concept of
"genre".

 

>Now I know that your groove music will be sorted correctly with my groove
music - cool!

But the assignment of what I think is "groove" or whatever is so subjective
that even if you did come up with a grand master genre list, there's no
guarantee that what I tagged as "groove" would be what you would tag as
"groove".

 

>As for "Mitch's Favs", it should be expressed by virtue of a rating value,
the sort of thing so many audio library programs already use.

I actually don't use genre in this fashion.  I was using this as an example
of how people might be currently using the genre field to categorize their
music.  It's all well and good to tell people that they shouldn't use the
genre in this way, but it's another to change the standard so that
ID3-compatible libraries and applications would drop your genre because they
couldn't find a match in their master list.  How do you think WinAmp users
would feel if half their genres got erased because when rewriting their
files, the ID3 library that it uses went to a new version of the standard
that restricted what genres they were allowed to have?  Do you think they'd
appreciate losing their data in the name of conformity?

 

Again, I can sympathize with your desire for a uniform set of genre values,
but (as I think has been discussed in this list before) maintaining a list
of something so subjective would be so impractical that any benefits would
be far outweighed by the effort to maintain the list.

 

 - Mitchell S. Honnert

 

 

 

 

  _____  

From: Pat Furrie [mailto:pfurrie at hotmail.com] 
Sent: Saturday, March 04, 2006 5:11 PM
To: id3v2 at id3.org
Subject: RE: [ID3 Dev] New genre coding idea?

 

Mitch,

 

Thanks for your reply.  

 

At one point it was determined that the functionality of ID3v1 was
insufficient to meet people's needs.  It's time to address the deficiencies
of the current genre model in some way.  While the need for exotic genre
naming schemes might be outside the realm of an orderly standardized word
set, we should also question the usefulness of retaining functionality which
I suspect is seeing limited use.  You're right - those people with
"Venezuelan Beaver Chill Out Groove" might not be able to have that precise
set of terms, though "Venezuelan," "Chill," and "Groove" would probably be
in the list, and you'd have them spelled the same, correct way every time;
who knows how many different ways people would misspell "Venezuelan."  This
is helpful when you share your music collection with me, and my software
doesn't have to include some funky new custom genre (one which I'd otherwise
never use, or which might be spelled differently than my version).  Now I
know that your groove music will be sorted correctly with my groove music -
cool!  As
for "Mitch's Favs", it should be expressed by virtue of a rating value, the
sort of thing so many audio library programs already use.

 

It would be great --  when I buy an MP3 online - for it to have a bunch of
genre and modifier selections already selected, saving me the effort of
having to go classify them all myself.  But if the classification system
can't be localized, it makes it less useful for handling everyone.

 

One way to rectify the backward compatibility issue: this is just another
genre list.  Just as some files have genre information recorded in both the
ID3v1 and ID3v2 tags simultaneously, this adds another option of
functionality.  It's not as elegant, but it doesn't break anything either.

 

Pat

 

 

  _____  

From: Mitchell S. Honnert [mailto:mitch at honnert.com] 
Sent: Saturday, March 04, 2006 4:40 PM
To: id3v2 at id3.org
Subject: RE: [ID3 Dev] New genre coding idea?

 

Pat, it's obvious you put a lot of thought into this proposal, but I think
there's a "showstopper" item missing from your Disadvantages list.
Converting the existing genre frame to support a bitmap style representation
of multiple genres would break backward compatibility.  The ID3 standard has
had free-form genres for so long, it'd be next to impossible to go back to
using a fixed list, even one that allowed for modified pairs like you
described.  For example, if your proposal were adopted, how should an
ID3-compatible program handle the genre of "Venezuelan Beaver Cheese Chill
Out Groove" or even "Mitch's Favs"?  You just never know what people have
put in the Genre field, so there's really no way to convert these values and
go back to a fixed set of values.  The genie is out of the bottle, so to
speak.

 

However, this doesn't mean that we can't clarify the genre frame
specification to promote the use of multiple genres.  The only application
that I've found that supports multiple genres is ID3-TagIT.  It uses what I
think is the simplest, best approach for multiple genres, which is simply
using a null char delimited list of genre values.  (In fact, I've adopted its
format in my own ID3 library, UltraID3Lib.)  This technique has the benefit
of being backward compatible with all ID3v2 versions.  (Apps should just
ignore anything after the first null char delimiter, so they wouldn't fully
support the format, but neither would this format break compatibility.)
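
A rough sketch of the null char delimited idea, in Python rather than any
particular ID3 library.  The function names are hypothetical, and the code
works on an already-decoded text string, not on a real frame with its header
and encoding byte.  It only shows how a multi-genre-aware reader and an older
single-genre reader would see the same value:

```python
# Sketch only: multiple genres stored as one null-delimited text value.
# join_genres, split_genres and first_genre_only are hypothetical helpers.

def join_genres(genres):
    """Build the text value: genre names separated by a single null char."""
    return "\x00".join(genres)

def split_genres(payload):
    """Multi-genre-aware reader: return every non-empty value in order."""
    return [g for g in payload.split("\x00") if g]

def first_genre_only(payload):
    """Older reader that ignores everything after the first null char."""
    return payload.split("\x00", 1)[0]

payload = join_genres(["Rock", "Soundtrack"])
print(split_genres(payload))      # ['Rock', 'Soundtrack']
print(first_genre_only(payload))  # Rock
```

A free-form value such as "Mitch's Favs" passes through unchanged, since it
simply becomes a one-element list.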

 

Anyway, not that I want to be such a downer, but I think it's too late to go
to a limited set of genre values.  There are just too many weird genres out
there to ever be codified in a manageable list.

 

Mitchell S. Honnert

www.UltraID3Lib.com

 

 

  _____  

From: Pat Furrie [mailto:pfurrie at hotmail.com] 
Sent: Saturday, March 04, 2006 3:31 PM
To: id3v2 at id3.org
Subject: [ID3 Dev] New genre coding idea?

 

I've been considering how multiple genres can be assigned via ID3.

 

What I do understand:

1)   ID3v1 had a single byte which mapped to a fixed and limited list of
genres.

2)   ID3v2 allows for a genre frame which can have some mix of genres.

3)   Very few programs, if any, are using the multiple genre capability.

4)   Programs like iTunes seem to allow multiple genres, but in reality
users are only creating a new genre which is the composite of existing
types.  However there is no cross-referencing: I can give an MP3 the "genre"
of "rock soundtrack" but if I list everything in the "rock" or "soundtrack"
genres, it doesn't show up.  Additionally, "soundtrack, rock" is not
equivalent to "rock, soundtrack" in the iTunes world.
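
To illustrate the cross-referencing problem in item 4, here is a small
Python sketch.  The library contents, the comma delimiter and the helper
names are all made up for the example.  It compares genre strings as
composite text versus as a set of individual genres:

```python
# Sketch only: composite genre strings versus sets of individual genres.
# The file names and the comma delimiter are assumptions for illustration.

def as_genre_set(text):
    """Split a comma-separated genre string into a case-insensitive set."""
    return frozenset(p.strip().lower() for p in text.split(",") if p.strip())

library = {
    "track1.mp3": "soundtrack, rock",
    "track2.mp3": "rock, soundtrack",
    "track3.mp3": "jazz",
}

def tracks_with(genre):
    """List every track whose genre set contains the requested genre."""
    wanted = genre.lower()
    return sorted(n for n, text in library.items() if wanted in as_genre_set(text))

# As plain strings the two composite genres differ...
print(library["track1.mp3"] == library["track2.mp3"])   # False
# ...but as sets they are the same, and both show up under "rock".
print(as_genre_set("soundtrack, rock") == as_genre_set("rock, soundtrack"))  # True
print(tracks_with("rock"))   # ['track1.mp3', 'track2.mp3']
```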

 

One way of dealing with it is to utilize free-form text with delimiters,
allowing users to enter as many of whatever genres they come up with.
However, this has a few problems:

1)   Because the length of the field is unknown, padding must be used, and
the possibility still exists of needing to re-write the rest of the file due
to exceeding the space given by any padding.

2)   Any spelling errors create additional genres when they shouldn't exist
(does "soundtrack" equal "sound track"?)

3)   What works for English doesn't work for other languages.  Localization
is difficult.
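
The padding concern in item 1 can be sketched as a simple size check.  This
is a simplification with hypothetical names: it looks only at the length of
the genre text and ignores frame headers and text encoding details.

```python
# Sketch only: can an edited genre value be written in place, or does the
# tag have to grow (forcing the rest of the file to be rewritten)?

def fits_in_padding(old_size, new_size, padding):
    """True if the growth of the value is covered by the available padding."""
    return new_size - old_size <= padding

old_value = "Rock"
new_value = "Rock\x00Soundtrack\x00Instrumental"
old_size = len(old_value.encode("latin-1"))   # 4 bytes
new_size = len(new_value.encode("latin-1"))   # 28 bytes

print(fits_in_padding(old_size, new_size, padding=64))  # True
print(fits_in_padding(old_size, new_size, padding=16))  # False
```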

 

The original one-byte method in ID3v1 did have the potential of being able
to be localized, since the genre name lookup table could be altered for
whatever the user's language is.  But it only allows a single genre, and it
is easy to show that most files (of any type) can be reasonably listed in
multiple categories.
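
As a sketch of the localization point: the ID3v1 genre is the last byte of
the 128-byte tag, so a reader only needs a per-language name table.  The
three codes below are from the usual ID3v1 list, but the German names and
the function names are just illustrative assumptions.

```python
# Sketch only: one ID3v1 genre byte, looked up in a per-language table.
# The "de" names are illustrative, not an official translation.

GENRE_NAMES = {
    "en": {0: "Blues", 17: "Rock", 24: "Soundtrack"},
    "de": {0: "Blues", 17: "Rock", 24: "Filmmusik"},
}

def genre_byte(tag):
    """Return the genre code from a raw 128-byte ID3v1 tag."""
    if len(tag) != 128 or not tag.startswith(b"TAG"):
        raise ValueError("not an ID3v1 tag")
    return tag[127]

def genre_name(code, lang="en"):
    """Map a genre code to a display name, falling back to English."""
    table = GENRE_NAMES.get(lang, GENRE_NAMES["en"])
    return table.get(code, "Unknown")

tag = b"TAG" + bytes(124) + bytes([24])   # empty fields, genre code 24
print(genre_name(genre_byte(tag), "en"))  # Soundtrack
print(genre_name(genre_byte(tag), "de"))  # Filmmusik
```

The file stores the same single byte either way.  Only the lookup table
changes with the user's language, and only one genre fits.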

 

Another solution is to use the same look-up table, but have multiple
delimited values listed.  

 

Let's bit-map the genre information.  Make the bit-mapped switches
correspond to smaller divisions of description as opposed to compounded
versions.  For example, terms like "Rock," "Classic Rock," "AlternRock,"
"Instrumental Rock," "Gothic Rock," "Folk-Rock," "Progressive Rock,"
"Psychedelic Rock," "Southern Rock," "Symphonic Rock," "Hard Rock," "Slow
Rock," and "Punk Rock" are replaced by "Rock," "Classic," "Alternate,"
"Instrumental," "Gothic," "Folk," "Progressive," "Psychedelic," "Southern,"
"Symphonic," "Hard," "Slow," and "Punk."  Note that any of those terms can
be combined with any of the others to give a larger variety of unique genres
(i.e., "Instrumental Punk" and "Southern Folk"), as well as with a myriad of
other terms condensed from the current list of genres, and a host of
additional modifiers and qualifiers.  This atomized approach to the
subjective art of classification allows for a far more expansive range of
control and choice while keeping spellings and terms standardized.  And
while there are fewer than 150 genres in the more-or-less standard version,
this new approach allows for, well... an awfully big number: 2 to the power
of the number of bits used in the bit map.
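
A minimal Python sketch of the bit-map idea.  The bit positions are purely
hypothetical (no term-to-bit assignment has been proposed here), and the
150-byte size is the figure suggested later in this message.

```python
# Sketch only: one bit per atomic term in a fixed-length bit map.
# The ordering of TERMS (and so each bit position) is an assumption.

TERMS = ["Rock", "Classic", "Alternate", "Instrumental", "Gothic", "Folk",
         "Progressive", "Psychedelic", "Southern", "Symphonic", "Hard",
         "Slow", "Punk"]
BIT_OF = {term: i for i, term in enumerate(TERMS)}
BITMAP_BYTES = 150

def encode(terms):
    """Set one bit per selected term; the result is always 150 bytes."""
    bits = 0
    for term in terms:
        bits |= 1 << BIT_OF[term]
    return bits.to_bytes(BITMAP_BYTES, "little")

def decode(bitmap):
    """Return the selected terms in table order."""
    bits = int.from_bytes(bitmap, "little")
    return [term for term, i in BIT_OF.items() if (bits >> i) & 1]

def has_term(bitmap, term):
    """Test a single bit without decoding the whole map."""
    i = BIT_OF[term]
    return bool((bitmap[i // 8] >> (i % 8)) & 1)

a = encode(["Instrumental", "Punk"])
b = encode(["Punk", "Instrumental"])
print(a == b)               # True: word order no longer matters
print(decode(a))            # ['Instrumental', 'Punk']
print(has_term(a, "Folk"))  # False
```

Localizing would then mean swapping the names in TERMS for another
language while the stored bits stay the same.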

 

After doing some digging up of genre types from the Internet, it seems 1000
bits should be sufficient.  This includes a wide range of modifying terms
and plenty of reserved space for the future.  Doing a little rounding to
come up with an even number, 150 bytes appears to be a good number to
consider for the size of the reserved space.
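
For reference, the bits-to-bytes arithmetic behind those figures works out
as follows (a quick check only; the variable names are mine):

```python
import math

bits_needed = 1000                              # estimate from the text above
min_bytes = math.ceil(bits_needed / 8)          # smallest whole-byte size
proposed_bytes = 150                            # rounded-up size from the text
capacity_bits = proposed_bytes * 8              # bits actually available
spare_bits = capacity_bits - bits_needed        # head-room beyond the estimate

print(min_bytes)      # 125
print(capacity_bits)  # 1200
print(spare_bits)     # 200
```

So 150 bytes holds 1200 switches, which is 200 more than the 1000-bit
estimate.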

 

 

Bitmap Advantages:

-       Efficient.  With a small amount of file space, a very large number
of genres can be described.

-       Complete.  A user can have any combination of a large number of
genre and modifiers to describe each audio file.

-       Language independent.  The bits map to terms which can be localized
to a user's language.

-       Can be cross-referenced.  Files can easily be sorted on a single
genre attribute or any combination.

-       Changes don't ever affect a file's size.  Once the fixed-length
bit-map has been attached, regardless of what genre/modifier values have
been selected, genre changes will no longer have any impact on file size.

-       Standardized terms and spellings.  A user has a hard enough time
entering the same text each time exactly the same, let alone having
different users entering textual genre information the same.  Consistency in
spelling and terms becomes possible when the choices are mapped to a fixed
word list.

-       Room for term growth.  The full list of genre switches wouldn't be
exhausted, leaving head-room for additional bit-mappable items to be added
over time.  

 

Bitmap Disadvantages

-       No user-customized genres.  This is the case with the ID3v1 scheme
as well.  However, the baseline list of proposed genres and modifiers is
much more vast, and the freedom for creating any combination gives choices
many orders of magnitude greater than before.  Also, by having standardized
genres, you can be sure that when you import an audio file with this scheme
into your own database, the genres will mesh - no ambiguity due to spelling,
language, or word order (that is, "soundtrack rock" will equal "rock
soundtrack".)

-       Not "human readable."  The genre and modifier data is bit-coded, so
if a user were to open such a file with a text viewer/editor, they'd have
difficulty making sense of it.  However, most users never have the need to,
know how to, or want to open audio files with a text editor.  They'll use
software designed for viewing and editing the tags.  Also, coded in hex,
this could easily exist with an XML framework, if need be.

-       Always consumes 150 bytes.  From a relative point-of-view, this is a
big increase. This is a lot more than the 1 byte for ID3v1.  However, from
an absolute viewpoint, this isn't much.  For a user with ten-thousand MP3s,
the total impact byte-wise is 1.5 megs.  Using my collection of slightly
more than ten-thousand MP3s as an example, it consumes just over forty
gigabytes of disk space.  1.5 megs of additional space amounts to a
percentage increase of less than 4 one-thousandths of one percent.
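
The overhead figures above can be checked with a few lines of Python.  This
assumes decimal units (1 meg = 1,000,000 bytes, 1 gig = 1,000,000,000
bytes) and rounds the collection to exactly 10,000 files and 40 gigabytes.

```python
files = 10_000
bitmap_bytes = 150
collection_bytes = 40 * 10**9          # roughly forty gigabytes

total_overhead = files * bitmap_bytes  # extra bytes across the collection
percent = total_overhead / collection_bytes * 100

print(total_overhead)        # 1500000, i.e. 1.5 megs
print(round(percent, 5))     # 0.00375, under 4 one-thousandths of a percent
```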

 

--------------

Pat

 

 

 


--
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.375 / Virus Database: 268.1.2/274 - Release Date: 3/3/2006




