[ID3 Dev] curious problem with Cyrillic letters

Sergei Gerasenko gerases at publicschoolworks.com
Sat Dec 22 08:30:08 PST 2007


Hi everybody,

I just bought a Sony NWZ-A818 MP3 player, and I have quite a large music collection with Cyrillic MP3 tags. After I transferred the
collection to the player, it displayed every Cyrillic character as a question mark. So I thought, bummer, the thing doesn't understand
Unicode. But as I was scrolling through the list, to my surprise I found a couple of tracks that were displayed correctly! I was very
excited and became determined to find out what was different about the files that the player displayed correctly.

I soon discovered that if I created a Cyrillic tag on Windows XP (using Windows Explorer), the player displayed the Cyrillic
characters correctly. If I took the same song and read the tag on Linux, the tag was also displayed correctly, BUT if I edited and
saved it on Linux (using Quod Libet, which uses a Python ID3 library), the tag turned into question marks both on Windows and on
the player.

So I reduced the tag as much as possible to find the difference. I left only the artist name and made it 3 letters long: "Vоп". Then I
used a hex editor to read the tag. The first difference I saw was that Windows wrote the tag as version 2.3, while Linux wrote
it as version 2.4.

Also, Windows made the length of the artist name frame (TPE1) "09", while the Linux version made it 2 bytes longer, "0b". The size of
the whole tag also differs: "1f76" and "0800" respectively. Otherwise the two dumps are completely identical.
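
For anyone who wants to check these numbers, here is a minimal Python sketch that decodes the first 20 bytes of each dump. It
assumes the usual ID3v2 layout: a 10-byte header whose tag size is a synchsafe integer, and frame sizes that are plain big-endian
in 2.3 but synchsafe in 2.4. The helper names are made up for illustration; they are not from any library.

```python
# Sketch only: helper names are hypothetical, bytes are copied from the hex dumps.

def synchsafe(b):
    """Decode a 4-byte synchsafe integer (7 significant bits per byte)."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]

def describe(data):
    """Return (major version, tag size, first frame ID, first frame size)."""
    if data[:3] != b"ID3":
        raise ValueError("no ID3v2 header")
    major = data[3]
    tag_size = synchsafe(data[6:10])  # synchsafe in both 2.3 and 2.4
    frame_id = data[10:14].decode("ascii")
    raw = data[14:18]
    # Assumption: 2.3 frame sizes are plain big-endian, 2.4 frame sizes are synchsafe.
    frame_size = synchsafe(raw) if major == 4 else int.from_bytes(raw, "big")
    return major, tag_size, frame_id, frame_size

windows = bytes.fromhex("49 44 33 03 00 00 00 00 1f 76 54 50 45 31 00 00 00 09 00 00")
linux = bytes.fromhex("49 44 33 04 00 00 00 00 08 00 54 50 45 31 00 00 00 0b 00 00")

print(describe(windows))  # (3, 4086, 'TPE1', 9)
print(describe(linux))    # (4, 1024, 'TPE1', 11)
```

So "1f76" works out to 4086 bytes of tag data and "0800" to 1024 bytes. That looks like a difference in padding only, not in
the text itself.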

Here are the corresponding hex dumps:

After editing on Windows
========================

sergei@ubuntu:~$ hexdump -Cn 200 song.mp3
00000000  49 44 33 03 00 00 00 00  1f 76 54 50 45 31 00 00  |ID3......vTPE1..|
00000010  00 09 00 00 01 ff fe 56  00 3e 04 3f 04 00 00 00  |.......V.>.?....|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*


After editing on Linux
======================

sergei@ubuntu:~$ hexdump -Cn 200 song.mp3
00000000  49 44 33 04 00 00 00 00  08 00 54 50 45 31 00 00  |ID3.......TPE1..|
00000010  00 0b 00 00 01 ff fe 56  00 3e 04 3f 04 00 00 00  |.......V.>.?....|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
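
To see what the two frame lengths actually cover, here is a rough Python sketch that decodes the TPE1 body from the dumps. It
assumes the body starts right after the 10-byte frame header (offset 0x14), and that the leading 01 byte means UTF-16 with a byte
order mark. The function name is made up for illustration. It only shows what the bytes contain; it does not by itself explain
the question marks.

```python
# Sketch only: decode_text_frame is a hypothetical helper, not a library call.

def decode_text_frame(payload):
    """Decode a text frame body: one encoding byte followed by the text."""
    encoding = payload[0]
    if encoding != 1:
        raise ValueError("this sketch only handles encoding byte 0x01")
    # Python's "utf-16" codec reads the ff fe byte order mark and strips it.
    return payload[1:].decode("utf-16")

# Bytes from offset 0x14 onwards; they are the same in both dumps.
body = bytes.fromhex("01 ff fe 56 00 3e 04 3f 04 00 00")

windows_text = decode_text_frame(body[:0x09])  # frame length from the Windows dump
linux_text = decode_text_frame(body[:0x0b])    # frame length from the Linux dump

print(repr(windows_text))
print(repr(linux_text))
```

With length "09" the frame holds just the three letters. With length "0b" it also takes in the next two 00 bytes, which decode
to a trailing NUL character after the name.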

This is where my expertise stops! Can someone figure out what's going on here?

If anybody is interested, both files can be found below, but the download speed from my machine is pretty bad, so please be patient.

http://65.27.155.246/~sergei/windows_version.mp3 and
http://65.27.155.246/~sergei/linux_version.mp3

Looking forward to your analysis!

Thanks,
    Sergei
