mirror of https://git.tukaani.org/xz.git synced 2025-04-22 23:50:51 +00:00

liblzma: Update lzma_lzip_decoder() docs about trailing data

Don't say that the .lz format allows trailing data. According to the
lzip 1.25 manual, trailing data isn't part of the file format at all.
However, tools are still expected to behave as usefully as possible
when there is trailing data.

Fix the description of lzip >= 1.20 behavior when some of the first
bytes of trailing data match the magic bytes. While the lzip 1.25 manual
recommends that none of the first four bytes in trailing data should
match the magic bytes, the default behavior of lzip 1.25 treats
trailing data as a corrupt member header only if two or three bytes
match the magic bytes; one matching byte isn't enough.

Reported-by: Antonio Diaz Diaz
Link: https://www.mail-archive.com/xz-devel@tukaani.org/msg00702.html
Lasse Collin 2025-04-21 12:23:37 +03:00
parent c330220d47
commit 516b90f6e1
GPG Key ID: 38EE757D69184620

@@ -862,18 +862,17 @@ extern LZMA_API(lzma_ret) lzma_alone_decoder(
* Just like with lzma_stream_decoder() for .xz files, LZMA_CONCATENATED
* should be used when decompressing normal standalone .lz files.
*
- * The .lz format allows putting non-.lz data at the end of a file after at
- * least one valid .lz member. That is, one can append custom data at the end
- * of a .lz file and the decoder is required to ignore it. In liblzma this
- * is relevant only when LZMA_CONCATENATED is used. In that case lzma_code()
- * will return LZMA_STREAM_END and leave lzma_stream.next_in pointing to
- * the first byte of the non-.lz data. An exception to this is if the first
- * 1-3 bytes of the non-.lz data are identical to the .lz magic bytes
- * (0x4C, 0x5A, 0x49, 0x50; "LZIP" in US-ASCII). In such a case the 1-3 bytes
- * will have been ignored by lzma_code(). If one wishes to locate the non-.lz
- * data reliably, one must ensure that the first byte isn't 0x4C. Actually
- * one should ensure that none of the first four bytes of trailing data are
- * equal to the magic bytes because lzip >= 1.20 requires it by default.
+ * If LZMA_CONCATENATED is used and there is non-.lz data after at least one
+ * valid .lz member, lzma_code() leaves lzma_stream.next_in pointing to the
+ * first byte of the non-.lz data and returns LZMA_STREAM_END. That is, one
+ * can append custom data at the end of a .lz file and the decoder will
+ * ignore it. An exception to this is if the first 1-3 bytes of the non-.lz
+ * data are identical to the .lz magic bytes (0x4C, 0x5A, 0x49, 0x50; "LZIP"
+ * in US-ASCII). In such a case the 1-3 bytes are consumed by lzma_code().
+ * If one wishes to locate the non-.lz data reliably, one must ensure that
+ * the first byte isn't 0x4C. It's best if none of the first four bytes of
+ * trailing data are equal to the magic bytes because if two or three bytes
+ * are, lzip >= 1.20 diagnoses it as a corrupt member header by default.
*
* \param strm Pointer to lzma_stream that is at least initialized
* with LZMA_STREAM_INIT.
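
The rule described in the updated comment can be illustrated with a small standalone check that a producer of .lz files might run on its trailing data before appending it. This is only a sketch and not part of liblzma or lzip: the helper names (magic_prefix_len, magic_position_matches, trailing_data_is_safe) are hypothetical, and comparing each of the first four bytes against the magic byte at the same position is an assumed reading of "none of the first four bytes of trailing data are equal to the magic bytes".

```c
#include <stddef.h>
#include <stdint.h>

/* .lz magic bytes: "LZIP" in US-ASCII (values taken from the doc comment). */
static const uint8_t LZIP_MAGIC[4] = { 0x4C, 0x5A, 0x49, 0x50 };

/*
 * Hypothetical helper: length of the longest prefix of buf that matches
 * the start of the magic bytes (0 to 4). Per the doc comment, a prefix
 * of 1-3 matching bytes would be consumed by lzma_code().
 */
static size_t magic_prefix_len(const uint8_t *buf, size_t size)
{
	size_t n = 0;
	while (n < size && n < sizeof(LZIP_MAGIC) && buf[n] == LZIP_MAGIC[n])
		++n;
	return n;
}

/*
 * Hypothetical helper: number of positions among the first four bytes
 * where buf holds the same value as the magic bytes at that position.
 * Assumption: this position-wise count is what the "two or three bytes
 * match" wording refers to.
 */
static size_t magic_position_matches(const uint8_t *buf, size_t size)
{
	size_t count = 0;
	for (size_t i = 0; i < size && i < sizeof(LZIP_MAGIC); ++i)
		if (buf[i] == LZIP_MAGIC[i])
			++count;
	return count;
}

/*
 * Conservative policy from the doc comment: trailing data is treated as
 * safe only if none of its first four bytes match the magic bytes.
 * Returns 1 if safe, 0 otherwise.
 */
static int trailing_data_is_safe(const uint8_t *buf, size_t size)
{
	return magic_position_matches(buf, size) == 0;
}
```

Trailing data that starts with "LZxy" would fail this check on both counts: its two-byte prefix would be consumed by lzma_code(), and two matching bytes fall in the range that lzip >= 1.20 reports as a corrupt member header by default. Data starting with "data" passes.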