Commit Graph

26 Commits

Author SHA1 Message Date
Lasse Collin f3872a5947 liblzma: Optimize LZ decoder slightly.
Now extra buffer space is reserved so that repeating bytes for
any single match will never need to copy from two places (both
the beginning and the end of the buffer). This simplifies
dict_repeat() and helps a little with speed.

This seems to reduce .lzma decompression time about 2 %, so
with .xz and CRC it could be slightly less. The small things
add up still.
2024-02-14 18:31:16 +02:00
Lasse Collin 22af94128b Add SPDX license identifier into 0BSD source code files. 2024-02-14 18:31:16 +02:00
Lasse Collin 689e0228ba Change most public domain parts to 0BSD.
Translations and doc/xz-file-format.txt and doc/lzma-file-format.txt
were not touched.

COPYING.0BSD was added.
2024-02-14 18:31:12 +02:00
Lasse Collin 5f22bd2d37 liblzma: Remove lzma_lz_decoder_uncompressed() as it's now unused. 2022-11-28 10:51:03 +02:00
Lasse Collin 218394958c liblzma: Pass the Filter ID to LZ encoder and decoder.
This allows using two Filter IDs with the same
initialization function and data structures.
2022-11-27 18:20:33 +02:00
Lasse Collin 9595a3119b liblzma: Add optional autodetection of LZMA end marker.
Turns out that this is needed for .lzma files as the spec in
LZMA SDK says that end marker may be present even if the size
is stored in the header. Such files are rare but exist in the
real world. The code in liblzma is so old that the spec didn't
exist in LZMA SDK back then and I had understood that such
files weren't possible (the lzma tool in LZMA SDK didn't
create such files).

This modifies the internal API so that LZMA decoder can be told
if EOPM is allowed even when the uncompressed size is known.
It's allowed with .lzma and not with other uses.

Thanks to Karl Beldan for reporting the problem.
2022-07-13 22:24:07 +03:00
Lasse Collin 608517b9b7 liblzma: Remove incorrect uses of lzma_attribute((__unused__)).
Caught by clang -Wused-but-marked-unused.
2019-06-24 22:50:36 +03:00
Lasse Collin 2a22de439e liblzma: Avoid memcpy(NULL, foo, 0) because it is undefined behavior.
I should have always known this but I didn't. Here is an example
as a reminder to myself:

    int mycopy(void *dest, void *src, size_t n)
    {
        memcpy(dest, src, n);
        return dest == NULL;
    }

In the example, a compiler may assume that dest != NULL because
passing NULL to memcpy() would be undefined behavior. Testing
with GCC 8.2.1, mycopy(NULL, NULL, 0) returns 1 with -O0 and -O1.
With -O2 the return value is 0 because the compiler infers that
dest cannot be NULL because it was already used with memcpy()
and thus the test for NULL gets optimized out.

In liblzma, if a null-pointer was passed to memcpy(), there were
no checks for NULL *after* the memcpy() call, so I cautiously
suspect that it shouldn't have caused bad behavior in practice,
but it's hard to be sure, and the problematic cases had to be
fixed anyway.

Thanks to Jeffrey Walton.
2019-05-13 20:05:17 +03:00
Antoine Cœur 2fb0ddaa55 spelling 2019-05-11 20:52:37 +03:00
Lasse Collin d4a0462abe liblzma: Avoid multiple definitions of lzma_coder structures.
Only one definition was visible in a translation unit.
It avoided a few casts and temp variables but seems that
this hack doesn't work with link-time optimizations in compilers
as it's not C99/C11 compliant.

Fixes:
http://www.mail-archive.com/xz-devel@tukaani.org/msg00279.html
2016-11-21 20:24:50 +02:00
Lasse Collin 3778db1be5 liblzma: Make the use of lzma_allocator const-correct.
There is a tiny risk of causing breakage: If an application
assigns lzma_stream.allocator to a non-const pointer, such
code won't compile anymore. I don't know why anyone would do
such a thing though, so in practice this shouldn't cause trouble.

Thanks to Jan Kratochvil for the patch.
2012-07-17 18:19:59 +03:00
Lasse Collin 4c6e146df9 Add underscores to attributes (__attribute((__foo__))). 2011-05-17 11:54:38 +03:00
Lasse Collin 920a69a8d8 Rename MIN() and MAX() to my_min() and my_max().
This should avoid some minor portability issues.
2010-05-26 10:36:46 +03:00
Lasse Collin e330fb7e6b Fix wrong indentation caused by incorrect settings
in the text editor.
2009-11-15 12:54:45 +02:00
Lasse Collin 02ddf09bc3 Put the interesting parts of XZ Utils into the public domain.
Some minor documentation cleanups were made at the same time.
2009-04-13 11:27:40 +03:00
Lasse Collin f76e39cf93 Added initial support for preset dictionary for raw LZMA1
and LZMA2. It is not supported by the .xz format or the xz
command line tool yet.
2009-01-27 18:36:05 +02:00
Lasse Collin 17781c2c20 The LZMA2 decoder fix introduced a bug to LZ decoder,
which made LZ decoder return too early after dictionary
reset. This fixes it.
2008-12-15 14:26:52 +02:00
Lasse Collin ff7fb2c605 Fix data corruption in LZMA2 decoder. 2008-12-15 10:01:59 +02:00
Lasse Collin 13d68b0698 LZ decoder cleanup 2008-09-13 13:54:00 +03:00
Lasse Collin 13a74b78e3 Renamed constants:
- LZMA_VLI_VALUE_MAX -> LZMA_VLI_MAX
  - LZMA_VLI_VALUE_UNKNOWN -> LZMA_VLI_UNKNOWN
  - LZMA_HEADER_ERRRO -> LZMA_OPTIONS_ERROR
2008-09-13 12:10:43 +03:00
Lasse Collin 3b34851de1 Sort of garbage collection commit. :-| Many things are still
broken. API has changed a lot and it will still change a
little more here and there. The command line tool doesn't
have all the required changes to reflect the API changes, so
it's easy to get "internal error" or trigger assertions.
2008-08-28 22:53:15 +03:00
Lasse Collin 7d17818cec Update the code to mostly match the new simpler file format
specification. Simplify things by removing most of the
support for known uncompressed size in most places.
There are some miscellaneous changes here and there too.

The API of liblzma has got many changes and still some
more will be done soon. While most of the code has been
updated, some things are not fixed (the command line tool
will choke with invalid filter chain, if nothing else).

Subblock filter is somewhat broken for now. It will be
updated once the encoded format of the Subblock filter
has been decided.
2008-06-18 18:02:10 +03:00
Lasse Collin f310c50286 Initialize the last byte of the dictionary to zero so that
lz_get_byte(lz, 0) returns zero. This was broken by
1a3b218598.
2008-03-11 15:17:16 +02:00
Lasse Collin 596fa1fac7 Always initialize lz->temp_size in lz_decoder.c. temp_size did
get initialized as a side-effect after allocating a new decoder,
but not when the decoder was reused.
2008-03-10 13:44:29 +02:00
Lasse Collin 1a3b218598 Don't memzero() the history buffer when initializing LZ
decoder. There's no danger of information leak here, so
it isn't required. Doing memzero() takes a lot of time
with large dictionaries, which could make it easier to
construct DoS attack to consume too much CPU time.
2008-02-02 14:51:06 +02:00
Lasse Collin 5d018dc035 Imported to git. 2007-12-09 00:42:33 +02:00