[ID3 Dev] Encoding UTF-16 (i.e UTF-16 with BOM which is the most compatible choice Little Endian or Big Endian ?)

Paul Taylor paul_t100 at fastmail.fm
Wed Apr 27 11:36:32 PDT 2011


On 27/04/2011 18:03, Steve Dirickson wrote:
>
> I think the key is that UTF-16BE is equivalent to "network byte 
> order". Any app that produces Unicode for external consumption really 
> should provide the BOM. But, if it doesn't, the only reasonable 
> assumption the recipient can make is that the text is in network byte 
> order. The alternative is to try heuristics and look for lots of 
> binary zero values (or lots of the same small value) in every other 
> byte, and then make the call based on whether those recurring values 
> are in the even or odd bytes.
>
I think you are missing my point. I'm NOT talking about the UTF-16BE 
encoding but the UTF-16 encoding (which is UTF-16 with a BOM and can 
contain LE or BE data). I have no problem reading or writing the data, 
but I would like to know which is the most compatible choice. When 
embedded within an MP3, UTF-16 is not magically decoded by the operating 
system; it has to be decoded by the application, and I'm sure there are 
some applications that can handle BOM LE but not BOM BE, or vice versa. 
There is also the complication that the byte order mark itself in BOM LE 
(FF FE) requires unsynchronization if you are using unsynchronization, 
whereas BOM BE (FE FF) does not, and applications such as Windows 7 
Explorer don't understand unsynchronization, which makes me think that 
BOM BE is the more compatible choice.
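
To make the BOM point concrete, here is a rough Python sketch. The helper 
names bom_byte_order and has_false_sync are made up for this example, and 
the check only looks for the false-sync pattern (an FF byte followed by a 
byte whose top three bits are set) that ID3v2 unsynchronization is meant to 
break up. It is an illustration under those assumptions, not a complete 
unsynchronization pass; in particular it ignores the separate rule for 
FF 00 sequences.

```python
import codecs


def bom_byte_order(data: bytes):
    """Return 'LE', 'BE', or None depending on a leading UTF-16 BOM."""
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "LE"
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "BE"
    return None


def has_false_sync(data: bytes) -> bool:
    """True if any FF byte is followed by a byte with its top 3 bits set.

    Hypothetical helper: checks only the %11111111 111xxxxx pattern,
    not the FF 00 case that a full unsynchronization pass also handles.
    """
    return any(a == 0xFF and (b & 0xE0) == 0xE0
               for a, b in zip(data, data[1:]))


# The same text written as "UTF-16 with BOM" in both byte orders.
le_text = codecs.BOM_UTF16_LE + "Title".encode("utf-16-le")
be_text = codecs.BOM_UTF16_BE + "Title".encode("utf-16-be")

# Python's generic "utf-16" codec reads the BOM and decodes either order.
decoded_le = le_text.decode("utf-16")
decoded_be = be_text.decode("utf-16")

# Looking at the two BOM bytes in isolation:
le_bom_false_sync = has_false_sync(codecs.BOM_UTF16_LE)
be_bom_false_sync = has_false_sync(codecs.BOM_UTF16_BE)
```

A reader that honours the BOM decodes both variants to the same string, so 
the compatibility question is only about applications that hard-code one 
byte order or do not undo unsynchronization.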

Paul