From sdirickson at real.com Thu Apr 28 11:14:50 2011
From: sdirickson at real.com (Steve Dirickson)
Date: Thu, 28 Apr 2011 11:14:50 -0700
Subject: [ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)
In-Reply-To: <4DB86230.6060604@fastmail.fm>
References: <4DB7ED1A.9060407@fastmail.fm> <4DB803C1.2060907@fastmail.fm> <4DB80FC9.7030807@fastmail.fm> <4A242CD46F4C34418D17941FF9746F9774438C59E2@SEAMBX.corp.real.com> <4DB86230.6060604@fastmail.fm>
Message-ID: <4A242CD46F4C34418D17941FF9746F9774438C5EB8@SEAMBX.corp.real.com>

Then I guess it depends on your definition of "most compatible". If you use BE, Windows machines, Intel-based Macs, etc. will have to do the byte swapping; if you use LE, other architectures will have to do so. Does that matter?

WRT unsync, if you aren't using it, the BOM issue is a don't-care. If you are using it: stop! ;-) Since you want to stay 2.3-compatible, you're going to be using a BOM either way; unsync just complicates the issue.

If you're trying to figure out how to break the fewest apps that don't properly handle Unicode tags, that's an exercise in frustration. Unless you know that a significant share of your target user base uses a known-broken app that happens to work with one but not the other, I'd say pick one, and accept that some number of misbehaving apps are going to show garbage to the user.

From: Paul Taylor [mailto:paul_t100 at fastmail.fm]
Sent: Wednesday, 27 April, 2011 11:37
To: id3v2 at id3.org
Cc: Steve Dirickson
Subject: Re: [ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)

On 27/04/2011 18:03, Steve Dirickson wrote:

I think the key is that UTF-16BE is equivalent to "network byte order". Any app that produces Unicode for external consumption really should provide the BOM.
But, if it doesn't, the only reasonable assumption the recipient can make is that the text is in network byte order. The alternative is to try heuristics and look for lots of binary zero values (or lots of the same small value) in every other byte, and then make the call based on whether those recurring values are in the even or odd bytes.

I think you are missing my point: I'm NOT talking about the UTF-16BE encoding but the UTF-16 encoding (which is UTF-16 with a BOM and can contain LE or BE data). I have no problem reading or writing the data, but would like to know which is the most compatible choice. When embedded within an mp3, UTF-16 is not magically decoded by the operating system; it has to be decoded by the application, and I'm sure there are some applications that can embed BOM LE but not BOM BE, or vice versa. There is also the complication that the byte order mark itself in BOM LE requires unsynchronization if you are using unsynchronization, whereas BOM BE does not, and applications such as Windows 7 Explorer don't understand unsynchronization, making me think that BOM BE is more compatible.

Paul

From paul_t100 at fastmail.fm Wed Apr 27 03:16:58 2011
From: paul_t100 at fastmail.fm (Paul Taylor)
Date: Wed, 27 Apr 2011 11:16:58 +0100
Subject: [ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)
Message-ID: <4DB7ED1A.9060407@fastmail.fm>

When encoding a string as UTF-16 (i.e. UTF-16 with a BOM), which is the most compatible choice, Little Endian or Big Endian? Should you change the default based on what platform you are on?

i.e. encoding Beck as BOM Big Endian would be
FE FF 00 42 00 65 00 63 00 6B

and BOM Little Endian would be
FF FE 42 00 65 00 63 00 6B 00

Seems Big Endian might be better because it wouldn't cause a load of unsynchronization if unsynchronization was enabled.
But I know Macs and PCs tend to default to the opposite for this sort of stuff.

Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: id3v2-unsubscribe at id3.org
For additional commands, e-mail: id3v2-help at id3.org

From paul_t100 at fastmail.fm Wed Apr 27 05:44:57 2011
From: paul_t100 at fastmail.fm (Paul Taylor)
Date: Wed, 27 Apr 2011 13:44:57 +0100
Subject: [ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)
In-Reply-To:
References: <4DB7ED1A.9060407@fastmail.fm> <4DB803C1.2060907@fastmail.fm>
Message-ID: <4DB80FC9.7030807@fastmail.fm>

On 27/04/2011 13:37, Pierre-Yves Thoulon wrote:
> Yes, I understand your concern. But it is not an issue of platform, e.g. Macs vs. PCs, because the UTF-16 format must specify which order is used through the BOM, and ID3 tags have to be platform agnostic...
> Note that v2.3 only supports UTF-16 LE. v2.4 supports all formats.

That doesn't sound right. AFAIK ID3v2.3 supports UTF-16 with a BOM, and the BOM can be either LE or BE. Why do you think it only supports BOM LE?

> I ran a small experiment with iTunes (*not* a decent decoder...), which apparently does not like UTF-16 LE... (prints a "?" instead of UTF-16 characters)

Did you use a BOM or just UTF-16 LE? The latter isn't a valid option.

Paul

From wheeler at kde.org Fri Apr 29 06:49:41 2011
From: wheeler at kde.org (Scott Wheeler)
Date: Fri, 29 Apr 2011 15:49:41 +0200
Subject: [ID3 Dev] New to the list
In-Reply-To: <20110429133727.GA27853@limey.net>
References: <002401cc0667$dcc86320$96592960$@com> <20110429133727.GA27853@limey.net>
Message-ID:

On Apr 29, 2011, at 3:37 PM, Ben Bennett wrote:

> However... the players mostly don't use it. You should hound them to get it used.

I believe rightly so.
I don't think that a normal operation like playing a file should trigger a potentially lossy (e.g. ID3v2.2 to ID3v2.4 upgrade) action on a user's files.

-Scott

(To the mods: please just delete the earlier post saying something similar; I sent it from the wrong address.)

From benf at damasoftware.com Fri Apr 29 05:20:40 2011
From: benf at damasoftware.com (Ben Fourie)
Date: Fri, 29 Apr 2011 14:20:40 +0200
Subject: [ID3 Dev] New to the list
Message-ID: <002401cc0667$dcc86320$96592960$@com>

Hey everyone,

I'm new to the list, and will be following id3v2 development closely from now on.

I have a suggestion regarding a new tag frame, or perhaps two.

Nearly every single application out there builds up its own database of metadata. One field that stands out, and that you find in all of those applications, is a rating field: the typical 5-star rating. This information is stored by the application, and in most cases cannot be migrated to another platform.

This will prevent most users from switching to another player (or at the very least make them very reluctant to).

The second risk is loss of storage media.

You can back up your mp3 library, but if you lose your OS drive (or the drive that holds the metadata for your media player), all your ratings will be gone, even after a system restore.

A suggestion from my side:

Have 2 rating frames.

One for your own personal rating with simply a numerical value from 1 - 5.

Then another rating frame for a public rating score.

Personal rating would contain the rating you chose for each individual score.

When you sync iTunes, for instance, Apple can then collect all those ratings and build a public rating for matching media.
Doing it this way will of course allow you to migrate from the dying Zune to iPod faster than you can swipe your credit card :-p without the fear of losing all your ratings.

I would love to hear your thoughts around this.

Ben Fourie
DaMa Software
083 262 3555
www.damasoftware.com
benf at damasoftware.com

From developer at audioranger.com Thu Apr 28 03:27:59 2011
From: developer at audioranger.com (Audio Ranger Development)
Date: Thu, 28 Apr 2011 12:27:59 +0200
Subject: [ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)
In-Reply-To: <4DB86230.6060604@fastmail.fm>
References: <4DB7ED1A.9060407@fastmail.fm> <4DB803C1.2060907@fastmail.fm> <4DB80FC9.7030807@fastmail.fm> <4A242CD46F4C34418D17941FF9746F9774438C59E2@SEAMBX.corp.real.com> <4DB86230.6060604@fastmail.fm>
Message-ID: <4DB9412F.9080707@audioranger.com>

It shouldn't matter. Applications which can handle UTF-16 should be able to decode both UTF-16BE and UTF-16LE. If you're in doubt, you can simply test this with the major music players to be sure.

Unsynchronization is a different problem: many applications don't support it. So don't use it unless it's actually required. That should only be the case if really old ID3 software or hardware is being used.

Mathias K.

On 27.04.2011 20:36, Paul Taylor wrote:
> I think you are missing my point: I'm NOT talking about the UTF-16BE encoding but the UTF-16 encoding (which is UTF-16 with a BOM and can contain LE or BE data). I have no problem reading or writing the data, but would like to know which is the most compatible choice.
> When embedded within an mp3, UTF-16 is not magically decoded by the operating system; it has to be decoded by the application, and I'm sure there are some applications that can embed BOM LE but not BOM BE, or vice versa. There is also the complication that the byte order mark itself in BOM LE requires unsynchronization if you are using unsynchronization, whereas BOM BE does not, and applications such as Windows 7 Explorer don't understand unsynchronization, making me think that BOM BE is more compatible.
>
> Paul

From sdirickson at real.com Wed Apr 27 10:03:28 2011
From: sdirickson at real.com (Steve Dirickson)
Date: Wed, 27 Apr 2011 10:03:28 -0700
Subject: [ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)
In-Reply-To:
References: <4DB7ED1A.9060407@fastmail.fm> <4DB803C1.2060907@fastmail.fm> <4DB80FC9.7030807@fastmail.fm>
Message-ID: <4A242CD46F4C34418D17941FF9746F9774438C59E2@SEAMBX.corp.real.com>

I think the key is that UTF-16BE is equivalent to "network byte order". Any app that produces Unicode for external consumption really should provide the BOM. But, if it doesn't, the only reasonable assumption the recipient can make is that the text is in network byte order. The alternative is to try heuristics and look for lots of binary zero values (or lots of the same small value) in every other byte, and then make the call based on whether those recurring values are in the even or odd bytes.

From: py.thoulon at gmail.com [mailto:py.thoulon at gmail.com] On Behalf Of Pierre-Yves Thoulon
Sent: Wednesday, 27 April, 2011 6:57
To: id3v2
Subject: Re: [ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)
> That doesn't sound right. AFAIK ID3v2.3 supports UTF-16 with a BOM, and the BOM can be either LE or BE. Why do you think it only supports BOM LE?

You're right, sorry, got confused with UTF-8 support...

>> I ran a small experiment with iTunes (*not* a decent decoder...), which apparently does not like UTF-16 LE... (prints a "?" instead of UTF-16 characters)
>
> Did you use a BOM or just UTF-16 LE? The latter isn't a valid option.

I stand corrected. I thought I had written UTF-16, but I hadn't (had to look at the raw file to figure it out). iTunes does accept UTF-16/LE with BOM (2.3 and 2.4) and UTF-16BE without BOM (2.4 only), at least in the little test I ran.

Pyt.

From paul_t100 at fastmail.fm Fri Apr 29 11:27:03 2011
From: paul_t100 at fastmail.fm (Paul Taylor)
Date: Fri, 29 Apr 2011 19:27:03 +0100
Subject: [ID3 Dev] New to the list
In-Reply-To:
References: <002401cc0667$dcc86320$96592960$@com> <20110429133727.GA27853@limey.net>
Message-ID: <4DBB02F7.6000004@fastmail.fm>

On 29/04/2011 14:49, Scott Wheeler wrote:
> On Apr 29, 2011, at 3:37 PM, Ben Bennett wrote:
>
>> However... the players mostly don't use it. You should hound them to get it used.
>
> I believe rightly so. I don't think that a normal operation like playing a file should trigger a potentially lossy (e.g. ID3v2.2 to ID3v2.4 upgrade) action on a user's files.
>
> -Scott

Scott, I think you are confusing POPM (rating) with PCNT (play count).

Paul

From fiji at limey.net Fri Apr 29 06:37:27 2011
From: fiji at limey.net (Ben Bennett)
Date: Fri, 29 Apr 2011 09:37:27 -0400
Subject: [ID3 Dev] New to the list
In-Reply-To: <002401cc0667$dcc86320$96592960$@com>
References: <002401cc0667$dcc86320$96592960$@com>
Message-ID: <20110429133727.GA27853@limey.net>

There is such a frame, "POPM":
http://id3.org/id3v2.3.0#head-2452ec9cf8b42c5c117b518b69e129ff67970852

However... the players mostly don't use it. You should hound them to get it used.

It incorporates both a rating and a play count. There can be several in the same file, with different email addresses to distinguish them.

-ben

On Fri, Apr 29, 2011 at 02:20:40PM +0200, Ben Fourie wrote:
> Hey everyone,
>
> I'm new to the list, and will be following id3v2 development closely from now on.
>
> I have a suggestion regarding a new tag frame, or perhaps two.
>
> Nearly every single application out there builds up its own database of metadata. One field that stands out, and that you find in all of those applications, is a rating field: the typical 5-star rating. This information is stored by the application, and in most cases cannot be migrated to another platform.
>
> This will prevent most users from switching to another player (or at the very least make them very reluctant to).
>
> The second risk is loss of storage media.
>
> You can back up your mp3 library, but if you lose your OS drive (or the drive that holds the metadata for your media player), all your ratings will be gone, even after a system restore.
>
> A suggestion from my side:
>
> Have 2 rating frames.
>
> One for your own personal rating with simply a numerical value from 1 - 5.
>
> Then another rating frame for a public rating score.
>
> Personal rating would contain the rating you chose for each individual score.
>
> When you sync iTunes, for instance, Apple can then collect all those ratings and build a public rating for matching media.
>
> Doing it this way will of course allow you to migrate from the dying Zune to iPod faster than you can swipe your credit card :-p without the fear of losing all your ratings.
>
> I would love to hear your thoughts around this.
>
> Ben Fourie
> DaMa Software
> 083 262 3555
> www.damasoftware.com
> benf at damasoftware.com

From paul_t100 at fastmail.fm Wed Apr 27 11:36:32 2011
From: paul_t100 at fastmail.fm (Paul Taylor)
Date: Wed, 27 Apr 2011 19:36:32 +0100
Subject: [ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)
In-Reply-To: <4A242CD46F4C34418D17941FF9746F9774438C59E2@SEAMBX.corp.real.com>
References: <4DB7ED1A.9060407@fastmail.fm> <4DB803C1.2060907@fastmail.fm> <4DB80FC9.7030807@fastmail.fm> <4A242CD46F4C34418D17941FF9746F9774438C59E2@SEAMBX.corp.real.com>
Message-ID: <4DB86230.6060604@fastmail.fm>

On 27/04/2011 18:03, Steve Dirickson wrote:
> I think the key is that UTF-16BE is equivalent to "network byte order". Any app that produces Unicode for external consumption really should provide the BOM. But, if it doesn't, the only reasonable assumption the recipient can make is that the text is in network byte order. The alternative is to try heuristics and look for lots of binary zero values (or lots of the same small value) in every other byte, and then make the call based on whether those recurring values are in the even or odd bytes.
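
For reference, the heuristic described in the quoted paragraph could look roughly like the following. This is only a sketch: the function name `guess_utf16_byte_order` is made up for illustration, it only counts zero bytes (not "lots of the same small value"), and it falls back to big endian on a tie, following the network-byte-order suggestion above.

```python
def guess_utf16_byte_order(data: bytes) -> str:
    """Guess 'be' or 'le' for BOM-less UTF-16 text (hypothetical helper).

    Mostly-Latin text has a zero high byte in every code unit. That zero
    is the first byte of each pair in big endian and the second byte in
    little endian, so we compare zero counts at even and odd offsets.
    """
    even_zeros = sum(1 for b in data[0::2] if b == 0)
    odd_zeros = sum(1 for b in data[1::2] if b == 0)
    if odd_zeros > even_zeros:
        return "le"
    # Ties (including empty input) fall back to network byte order.
    return "be"


print(guess_utf16_byte_order("Beck".encode("utf-16-le")))  # le
print(guess_utf16_byte_order("Beck".encode("utf-16-be")))  # be
```

A guess like this is unreliable for short strings and for scripts whose code units rarely contain a zero byte, which is why a BOM is preferable whenever the writer can provide one.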

I think you are missing my point: I'm NOT talking about the UTF-16BE encoding but the UTF-16 encoding (which is UTF-16 with a BOM and can contain LE or BE data). I have no problem reading or writing the data, but would like to know which is the most compatible choice. When embedded within an mp3, UTF-16 is not magically decoded by the operating system; it has to be decoded by the application, and I'm sure there are some applications that can embed BOM LE but not BOM BE, or vice versa. There is also the complication that the byte order mark itself in BOM LE requires unsynchronization if you are using unsynchronization, whereas BOM BE does not, and applications such as Windows 7 Explorer don't understand unsynchronization, making me think that BOM BE is more compatible.

Paul

From pgbennett at comcast.net Sat Apr 30 08:41:47 2011
From: pgbennett at comcast.net (Peter Bennett)
Date: Sat, 30 Apr 2011 11:41:47 -0400
Subject: [ID3 Dev] New to the list
In-Reply-To: <002401cc0667$dcc86320$96592960$@com>
References: <002401cc0667$dcc86320$96592960$@com>
Message-ID: <4DBC2DBB.90403@comcast.net>

Hi,

FYI: I have an application for backing up your tags. tagbkup backs up and restores your ID3 tags, so that if you have to restore your disk, your ratings and other updates can be restored, and you do not have to back up entire mp3 libraries.

See http://jampal.sf.net

Peter

On 4/29/2011 8:20 AM, Ben Fourie wrote:
> Hey everyone,
>
> I'm new to the list, and will be following id3v2 development closely from now on.
>
> I have a suggestion regarding a new tag frame, or perhaps two.
>
> Nearly every single application out there builds up its own database of metadata. One field that stands out, and that you find in all of those applications, is a rating field: the typical 5-star rating. This information is stored by the application, and in most cases cannot be migrated to another platform.
>
> This will prevent most users from switching to another player (or at the very least make them very reluctant to).
>
> The second risk is loss of storage media.
>
> You can back up your mp3 library, but if you lose your OS drive (or the drive that holds the metadata for your media player), all your ratings will be gone, even after a system restore.
>
> A suggestion from my side:
>
> Have 2 rating frames.
>
> One for your own personal rating with simply a numerical value from 1 - 5.
>
> Then another rating frame for a public rating score.
>
> Personal rating would contain the rating you chose for each individual score.
>
> When you sync iTunes, for instance, Apple can then collect all those ratings and build a public rating for matching media.
>
> Doing it this way will of course allow you to migrate from the dying Zune to iPod faster than you can swipe your credit card :-p without the fear of losing all your ratings.
>
> I would love to hear your thoughts around this.
>
> Ben Fourie
> DaMa Software
> 083 262 3555
> www.damasoftware.com
> benf at damasoftware.com

From pierre-yves.thoulon at centraliens.net Wed Apr 27 04:48:27 2011
From: pierre-yves.thoulon at centraliens.net (Pierre-Yves Thoulon)
Date: Wed, 27 Apr 2011 13:48:27 +0200
Subject: [ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)
In-Reply-To: <4DB7ED1A.9060407@fastmail.fm>
References: <4DB7ED1A.9060407@fastmail.fm>
Message-ID:

Shouldn't matter. The BOM indicates which convention was used, and so any decent decoder should be able to handle both UTF-16 types...

Best regards,
Pyt.

On Wed, Apr 27, 2011 at 12:16, Paul Taylor wrote:

> When encoding a string as UTF-16 (i.e. UTF-16 with a BOM), which is the most compatible choice, Little Endian or Big Endian? Should you change the default based on what platform you are on?
>
> i.e. encoding Beck as BOM Big Endian would be
> FE FF 00 42 00 65 00 63 00 6B
>
> and BOM Little Endian would be
> FF FE 42 00 65 00 63 00 6B 00
>
> Seems Big Endian might be better because it wouldn't cause a load of unsynchronization if unsynchronization was enabled.
>
> But I know Macs and PCs tend to default to the opposite for this sort of stuff.
>
> Paul

From paul_t100 at fastmail.fm Wed Apr 27 04:53:37 2011
From: paul_t100 at fastmail.fm (Paul Taylor)
Date: Wed, 27 Apr 2011 12:53:37 +0100
Subject: [ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)
In-Reply-To:
References: <4DB7ED1A.9060407@fastmail.fm>
Message-ID: <4DB803C1.2060907@fastmail.fm>

On 27/04/2011 12:48, Pierre-Yves Thoulon wrote:
> Shouldn't matter. The BOM indicates which convention was used, and so any decent decoder should be able to handle both UTF-16 types...
>
> Best regards,
> Pyt.

Thanks, but I really want to know the reality of the situation, not the optimistic outlook.
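
To make the byte sequences in this exchange concrete, here is a small sketch that encodes "Beck" (capital B is U+0042) with each BOM and then applies unsynchronization. The helper names are made up for illustration, and the rule in `unsynchronise` is my reading of section 5 of the ID3v2.3 spec (insert 0x00 after any 0xFF that is followed by 0x00 or by a byte with its top three bits set), so it is worth checking against the spec before relying on it.

```python
import codecs


def utf16_with_bom(text: str, big_endian: bool) -> bytes:
    """Encode text as UTF-16 preceded by the matching byte order mark."""
    if big_endian:
        return codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def unsynchronise(data: bytes) -> bytes:
    """Assumed v2.3 rule: add 0x00 after 0xFF when followed by 0x00 or >= 0xE0."""
    out = bytearray()
    for i, byte in enumerate(data):
        out.append(byte)
        if byte == 0xFF and i + 1 < len(data):
            following = data[i + 1]
            if following == 0x00 or following >= 0xE0:
                out.append(0x00)
    return bytes(out)


be = utf16_with_bom("Beck", big_endian=True)
le = utf16_with_bom("Beck", big_endian=False)
print(be.hex(" ").upper())                 # FE FF 00 42 00 65 00 63 00 6B
print(le.hex(" ").upper())                 # FF FE 42 00 65 00 63 00 6B 00
print(unsynchronise(be).hex(" ").upper())  # FE FF 00 00 42 00 65 00 63 00 6B
print(unsynchronise(le).hex(" ").upper())  # FF 00 FE 42 00 65 00 63 00 6B 00
```

Under this reading the LE BOM (FF FE) is itself a false sync, but the BE BOM is not entirely exempt either: FE FF followed by a 00 high byte contains the FF 00 pair that the spec also asks encoders to escape. Either way the cost is one extra byte per string, so it probably should not drive the choice of byte order.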

From pierre-yves.thoulon at centraliens.net Wed Apr 27 06:56:47 2011
From: pierre-yves.thoulon at centraliens.net (Pierre-Yves Thoulon)
Date: Wed, 27 Apr 2011 15:56:47 +0200
Subject: [ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)
In-Reply-To: <4DB80FC9.7030807@fastmail.fm>
References: <4DB7ED1A.9060407@fastmail.fm> <4DB803C1.2060907@fastmail.fm> <4DB80FC9.7030807@fastmail.fm>
Message-ID:

> That doesn't sound right. AFAIK ID3v2.3 supports UTF-16 with a BOM, and the BOM can be either LE or BE. Why do you think it only supports BOM LE?

You're right, sorry, got confused with UTF-8 support...

>> I ran a small experiment with iTunes (*not* a decent decoder...), which apparently does not like UTF-16 LE... (prints a "?" instead of UTF-16 characters)
>
> Did you use a BOM or just UTF-16 LE? The latter isn't a valid option.

I stand corrected. I thought I had written UTF-16, but I hadn't (had to look at the raw file to figure it out). iTunes does accept UTF-16/LE with BOM (2.3 and 2.4) and UTF-16BE without BOM (2.4 only), at least in the little test I ran.

Pyt.
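
As a rough illustration of what a reader has to do for the two cases tested above, the sketch below decodes a text payload according to the ID3v2 text encoding byte: $01 is UTF-16 with a BOM (2.3 and 2.4) and $02 is UTF-16BE without a BOM (2.4 only). The function name `decode_id3_utf16` is hypothetical, the big endian fallback for a missing BOM is an assumption taken from the network-byte-order suggestion earlier in the thread, and null terminators are not handled.

```python
import codecs


def decode_id3_utf16(payload: bytes, encoding_byte: int) -> str:
    """Decode an ID3v2 text payload stored with encoding byte 0x01 or 0x02."""
    if encoding_byte == 0x01:
        # UTF-16 with BOM: the BOM says which byte order follows.
        if payload.startswith(codecs.BOM_UTF16_LE):
            return payload[2:].decode("utf-16-le")
        if payload.startswith(codecs.BOM_UTF16_BE):
            return payload[2:].decode("utf-16-be")
        # Assumption: a writer that omitted the BOM is treated as big endian.
        return payload.decode("utf-16-be")
    if encoding_byte == 0x02:
        # UTF-16BE without BOM, only defined in ID3v2.4.
        return payload.decode("utf-16-be")
    raise ValueError(f"not a UTF-16 encoding byte: {encoding_byte:#04x}")


print(decode_id3_utf16(bytes.fromhex("FF FE 42 00 65 00 63 00 6B 00"), 0x01))  # Beck
print(decode_id3_utf16(bytes.fromhex("FE FF 00 42 00 65 00 63 00 6B"), 0x01))  # Beck
print(decode_id3_utf16(bytes.fromhex("00 42 00 65 00 63 00 6B"), 0x02))        # Beck
```

A reader written this way accepts either BOM, which is the behaviour the "any decent decoder" remarks in this thread assume; whether a given player actually does so still has to be tested.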