[ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)

Steve Dirickson sdirickson at real.com
Wed Apr 27 10:03:28 PDT 2011


I think the key is that UTF-16BE is equivalent to "network byte order". Any app that produces Unicode for external consumption really should provide the BOM. But, if it doesn't, the only reasonable assumption the recipient can make is that the text is in network byte order. The alternative is to try heuristics and look for lots of binary zero values (or lots of the same small value) in every other byte, and then make the call based on whether those recurring values are in the even or odd bytes.
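A minimal sketch of that decision order in Python, for illustration only. The function name `guess_utf16_byte_order` is hypothetical, and the heuristic counts only zero bytes (not "lots of the same small value"), so it is only meaningful for mostly-Latin text:

```python
import codecs


def guess_utf16_byte_order(data: bytes) -> str:
    """Guess the codec name for raw UTF-16 bytes (hypothetical helper)."""
    # 1. An explicit BOM settles the question.
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"

    # 2. No BOM: count binary zeros at even and odd byte offsets.
    #    For mostly-Latin text the zero high byte comes first in BE
    #    (even offsets) and second in LE (odd offsets).
    even_zeros = data[0::2].count(0)
    odd_zeros = data[1::2].count(0)
    if odd_zeros > even_zeros:
        return "utf-16-le"

    # 3. Otherwise (including ties) assume network byte order.
    return "utf-16-be"
```

When the counts give no signal either way, the sketch falls back to big-endian, which matches the "network byte order" assumption above.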

From: py.thoulon at gmail.com [mailto:py.thoulon at gmail.com] On Behalf Of Pierre-Yves Thoulon
Sent: Wednesday, 27 April, 2011 6:57
To: id3v2
Subject: Re: [ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)

That doesn't sound right. AFAIK ID3v2.3 supports UTF-16 with a BOM, and the BOM can be either LE or BE. Why do you think it only supports a LE BOM?
You're right, sorry, got confused with UTF-8 support...


I ran a small experiment with iTunes (*not* a decent decoder...), which apparently does not like UTF-16 LE... (prints a "?" instead of UTF-16 characters)
Did you use a BOM, or just UTF-16 LE without one? The latter isn't a valid option.
I stand corrected. I thought I had written a UTF-16 but I hadn't (had to look at the raw file to figure it out). iTunes does accept UTF-16/LE with BOM (2.3 and 2.4) and UTF-16BE without BOM (2.4 only), at least in the small test I ran.
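
For reference, a rough Python sketch of the two variants that worked. The encoding byte values ($01 for UTF-16 with BOM, $02 for UTF-16BE without BOM, the latter only in 2.4) are taken from the ID3v2 specs. The helper name `encode_id3_text` and its parameters are made up for illustration, and it builds only the encoding byte plus the text, not the frame header or any string terminator:

```python
import codecs


def encode_id3_text(text: str, version: int = 3, with_bom: bool = True) -> bytes:
    """Return encoding byte + encoded text for an ID3v2 text frame body.

    Hypothetical helper, not part of any tagging library.
    """
    if with_bom:
        # $01: UTF-16 with BOM, available in both ID3v2.3 and ID3v2.4.
        # Little-endian is chosen here only because it is the variant tested above.
        return b"\x01" + codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    if version != 4:
        # ID3v2.3 defines no BOM-less UTF-16 encoding.
        raise ValueError("UTF-16 without BOM requires ID3v2.4")
    # $02: UTF-16BE without BOM, ID3v2.4 only.
    return b"\x02" + text.encode("utf-16-be")
```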

Pyt.
