[ID3 Dev] Synchronised lyrics/text frames and Byte-Order-Markers
Mathias Kunter
mathiaskunter at yahoo.de
Wed Feb 10 00:01:56 PST 2010
Hi Martin,
I'd say the spec is clear about this: if the text encoding signals UTF-16, then each sync MUST have its individual BOM, since the specification requires that a BOM must always be present when encoding UTF-16 strings. Since each sync is stored as a terminated string, each string must have a BOM, and each string MAY use a different BOM.
It isn't recommended to mix big-endian and little-endian strings within the same frame, but your implementation should be able to handle different BOMs when decoding a SYLT frame (or any other frame with multiple UTF-16 encoded strings): "Be conservative in what you do; be liberal in what you accept from others."
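A minimal sketch of such a tolerant decoder, in Python, for the sync list of a SYLT frame whose encoding byte signals UTF-16. The names parse_syncs and decode_utf16_with_bom are made up for illustration. The 4-byte big-endian time stamp and the big-endian fallback for a string without a BOM are assumptions of this sketch, not something stated in this thread.

```python
import codecs
import struct


def decode_utf16_with_bom(raw: bytes, fallback: str = "utf-16-be") -> str:
    """Decode one UTF-16 string, honouring its own BOM if it has one."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[2:].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[2:].decode("utf-16-be")
    # No BOM: accepted anyway, using an assumed fallback byte order.
    return raw.decode(fallback)


def parse_syncs(body: bytes):
    """Return a list of (text, timestamp) pairs from the sync list.

    `body` is everything after the content descriptor. Each entry is a
    UTF-16 string, a $00 00 terminator, and an (assumed) 4-byte time stamp.
    """
    syncs = []
    pos = 0
    while pos < len(body):
        # Scan in 2-byte steps so a zero byte inside a code unit
        # is not mistaken for the terminator.
        end = pos
        while end + 1 < len(body) and body[end:end + 2] != b"\x00\x00":
            end += 2
        if end + 2 + 4 > len(body):
            raise ValueError("truncated sync entry")
        text = decode_utf16_with_bom(body[pos:end])
        (timestamp,) = struct.unpack(">I", body[end + 2:end + 6])
        syncs.append((text, timestamp))
        pos = end + 6
    return syncs
```

Because the byte order is decided per string, a frame that mixes little-endian and big-endian syncs decodes just like one that uses a single byte order throughout.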
> All syncs can have their own individual BOMs.
> This would be crazy - to say the least.
Why? Because it takes, let's say, 25,000 songs * 200 sync strings * 2 bytes per BOM = around 10 MB of disk space for an entire music collection? Storing album cover artwork takes much more space.
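The estimate can be checked in a few lines of Python. The variable names are only illustrative; the figures are the ones given above.

```python
songs = 25_000          # size of the music collection
syncs_per_song = 200    # sync strings per SYLT frame
bom_bytes = 2           # one UTF-16 BOM per sync string

overhead_bytes = songs * syncs_per_song * bom_bytes
overhead_mb = overhead_bytes / 1_000_000  # decimal megabytes

print(overhead_bytes, overhead_mb)
```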
Best regards,
Mathias K.
________________________________
From: Martin Benkert <martin.benkert at gmail.com>
To: id3v2 at id3.org
Sent: Monday, 8 February 2010, 21:02:31
Subject: [ID3 Dev] Synchronised lyrics/text frames and Byte-Order-Markers
Hi,
the synchronized lyrics frame SLT/SYLT supports text encodings, and also
Byte-Order-Markers (BOMs).
It has a 'Content Descriptor' which is simply an encoded string and might
have a BOM, just like all other encoded frames.
But it also has a binary structure of items called 'syncs', which have
this structure:
    Terminated text to be synced (typically a syllable)
    Sync identifier (terminator to above string)   $00 (00)
    Time stamp                                     $xx (xx ...)
The first item is plain text. I guess it is encoded according to the
specified encoding of the frame.
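As a rough illustration of this layout, here is a Python sketch that serialises one sync entry as UTF-16. The function name encode_sync and its parameters are hypothetical. Whether each entry carries its own BOM is exactly the open question of this message, so it is left as a switch, and the 4-byte big-endian time stamp is an assumption of the sketch.

```python
import codecs
import struct


def encode_sync(text: str, timestamp: int,
                little_endian: bool = True, with_bom: bool = True) -> bytes:
    """Serialise one sync entry: text, $00 00 terminator, time stamp."""
    bom = codecs.BOM_UTF16_LE if little_endian else codecs.BOM_UTF16_BE
    codec = "utf-16-le" if little_endian else "utf-16-be"
    # The explicit codecs never emit a BOM themselves, so it is added by hand.
    payload = (bom if with_bom else b"") + text.encode(codec)
    # Assumed 4-byte big-endian time stamp.
    return payload + b"\x00\x00" + struct.pack(">I", timestamp)


entry = encode_sync("la", 1000)
print(entry.hex())
```

With with_bom=True each entry grows by two bytes, which is the per-sync overhead the question is about.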
But what about BOMs here? In a typical file there might be some hundred
syncs (if they are syllables). If they all have individual BOMs this will
be a lot of data. However, if they do not have BOMs at all, it is not
clear how for instance Little-Endian encoding might be specified.
There are four ways it might be intended:
- syncs do not have any BOMs at all. This would be nice, but it implies
Big-Endian byte-order.
- the very first sync might have a BOM that is also applied to all
following syncs. This appears to be the best way. It also allows
Little-Endian byte-order to be specified. But the ID3 standards do not
specify this in any way.
- all syncs can have their own individual BOMs. This would be crazy - to
say the least. But it would comply with the ID3 standards (at least in
an implicit way).
- the BOM used by 'Content Descriptor' is also used for the syncs. Might be
nice - but it breaks all conventions: other frames with two encoded strings
(COM/COMM, COMR, GEO/GEOB, ULT/USLT) may all have individual BOMs for their
text fields - only ID3v2.4 specifies 'All strings in the same frame SHALL
have the same byte order'.
This suggests that the lyrics in a SLT/SYLT frame are a real oddity with
respect to BOMs.
Question here is: Is there a recommended way to specify the byte-order of
the lyrics of synchronized lyrics frames SLT/SYLT?
Thanks
Martin