root/xz

mirror of https://git.tukaani.org/xz.git synced 2026-03-14 05:38:03 +00:00

Author	SHA1	Message	Date
Lasse Collin	f3872a5947	liblzma: Optimize LZ decoder slightly. Now extra buffer space is reserved so that repeating bytes for any single match will never need to copy from two places (both the beginning and the end of the buffer). This simplifies dict_repeat() and helps a little with speed. This seems to reduce .lzma decompression time about 2 %, so with .xz and CRC it could be slightly less. The small things add up still.	2024-02-14 18:31:16 +02:00
Lasse Collin	eb518446e5	liblzma: LZMA decoder: Get rid of next_state[]. It's not completely obvious if this is better in the decoder. It should be good if compiler can avoid creating a branch (like using CMOV on x86). This also makes lzma_encoder.c use the new macros.	2024-02-14 18:31:16 +02:00
Lasse Collin	e0c0ee475c	liblzma: LZMA decoder improvements. This adds macros for bittree decoding which prepares the code for alternative C versions and inline assembly.	2024-02-14 18:31:16 +02:00
Jia Tan	de5c5e4176	liblzma: Creates Non-resumable and Resumable modes for lzma_decoder. The new decoder resumes the first decoder loop in the Resumable mode. Then, the code executes in Non-resumable mode until it detects that it cannot guarantee to have enough input/output to decode another symbol. The Resumable mode is how the decoder has always worked. Before decoding every input bit, it checks if there is enough space and will save its location to be resumed later. When the decoder has more input/output, it jumps back to the correct sequence in the Resumable mode code. When the input/output buffers are large, the Resumable mode is much slower than the Non-resumable because it has more branches and is harder for the compiler to optimize since it is in a large switch block. Early benchmarking shows significant time improvement (8-10% on gcc and clang x86) by using the Non-resumable code as much as possible.	2024-02-14 18:31:16 +02:00
Jia Tan	e446ab7a18	liblzma: Creates separate "safe" range decoder mode. The new "safe" range decoder mode is the same as old range decoder, but now the default behavior of the range decoder will not check if there is enough input or output to complete the operation. When the buffers are close to fully consumed, the "safe" operations must be used instead. This will improve speed because it will reduce the number of branches needed for most of the range decoder operations.	2024-02-14 18:31:16 +02:00
Lasse Collin	b941549573	liblzma: Include the SPDX license identifier 0BSD to generated files. Perhaps the generated files aren't even copyrightable but using the same license for them as for the rest of the liblzma keeps things more consistent for tools that look for license info.	2024-02-14 18:31:16 +02:00
Lasse Collin	22af94128b	Add SPDX license identifier into 0BSD source code files.	2024-02-14 18:31:16 +02:00
Lasse Collin	689e0228ba	Change most public domain parts to 0BSD. Translations and doc/xz-file-format.txt and doc/lzma-file-format.txt were not touched. COPYING.0BSD was added.	2024-02-14 18:31:12 +02:00
Jia Tan	5dad6f628a	liblzma: Improve lzma encoder init function consistency. lzma_encoder_init() did not check for NULL options, but lzma2_encoder_init() did. This is more of a code style improvement than anything else to help make lzma_encoder_init() and lzma2_encoder_init() more similar.	2023-12-16 20:18:47 +08:00
Lasse Collin	46007049cd	liblzma: Fix compilation of fastpos_tablegen.c. The macro lzma_attr_visibility_hidden has to be defined to make fastpos.h usable. The visibility attribute is irrelevant to fastpos_tablegen.c so simply #define the macro to an empty value. fastpos_tablegen.c is never built by the included build systems and so the problem wasn't noticed earlier. It's just a standalone program for generating fastpos_table.c. Fixes: https://github.com/tukaani-project/xz/pull/69 Thanks to GitHub user Jamaika1.	2023-10-31 21:41:09 +02:00
Lasse Collin	41113fe30a	liblzma: Use lzma_attr_visibility_hidden on private extern declarations. These variables are internal to liblzma and not exposed in the API.	2023-10-30 18:06:25 +02:00
Dimitri Papadopoulos Orfanos	42df7c7aa1	Docs: Fix typos found by codespell	2023-07-31 20:02:21 +08:00
Jia Tan	8f23657498	liblzma: Exports lzma_mt_block_size() as an API function. The lzma_mt_block_size() was previously just an internal function for the multithreaded .xz encoder. It is used to provide a recommended Block size for a given filter chain. This function is helpful to determine the maximum Block size for the multithreaded .xz encoder when one wants to change the filters between blocks. Then, this determined Block size can be provided to lzma_stream_encoder_mt() in the lzma_mt options parameter when initializing the coder. This requires one to know all the filter chains they are using before starting to encode (or at least the filter chain that will need the largest Block size), but that isn't a bad limitation.	2023-05-11 23:54:44 +08:00
Jia Tan	116e81f002	Build: Removes redundant check for LZMA1 filter support.	2023-03-23 21:48:52 +08:00
Lasse Collin	33b8a24b66	liblzma: Add LZMA_FILTER_LZMA1EXT to support LZMA1 without end marker. Some file formats need support for LZMA1 streams that don't use the end of payload marker (EOPM) alias end of stream (EOS) marker. So far liblzma API has supported decompressing such streams via lzma_alone_decoder() when .lzma header specifies a known uncompressed size. Encoding support hasn't been available in the API. Instead of adding a new LZMA1-only API for this purpose, this commit adds a new filter ID for use with raw encoder and decoder. The main benefit of this approach is that then also filter chains are possible, for example, if someone wants to implement support for .7z files that use the x86 BCJ filter with LZMA1 (not BCJ2 as that isn't supported in liblzma).	2022-11-27 23:16:21 +02:00
Lasse Collin	9a304bf1e4	liblzma: Avoid unneeded use of void pointer in LZMA decoder.	2022-11-27 18:43:07 +02:00
Lasse Collin	218394958c	liblzma: Pass the Filter ID to LZ encoder and decoder. This allows using two Filter IDs with the same initialization function and data structures.	2022-11-27 18:20:33 +02:00
Lasse Collin	3be88ae071	liblzma: Allow nice_len 2 and 3 even if match finder requires 3 or 4. That is, if the specified nice_len is smaller than the minimum of the match finder, silently use the match finder's minimum value instead of reporting an error. The old behavior is annoying to users and it complicates xz options handling too.	2022-11-24 23:23:55 +02:00
Lasse Collin	c392bf8ccb	liblzma: Fix infinite loop in LZMA encoder init with dict_size >= 2 GiB. The encoder doesn't support dictionary sizes larger than 1536 MiB. This is validated, for example, when calculating the memory usage via lzma_raw_encoder_memusage(). It is also enforced by the LZ part of the encoder initialization. However, LZMA encoder with LZMA_MODE_NORMAL did an unsafe calculation with dict_size before such validation and that results in an infinite loop if dict_size was 2 << 30 or greater.	2022-11-22 11:23:23 +02:00
Lasse Collin	107c93ee5c	liblzma: Rename a variable and improve a comment.	2022-07-14 18:12:38 +03:00
Lasse Collin	9595a3119b	liblzma: Add optional autodetection of LZMA end marker. Turns out that this is needed for .lzma files as the spec in LZMA SDK says that end marker may be present even if the size is stored in the header. Such files are rare but exist in the real world. The code in liblzma is so old that the spec didn't exist in LZMA SDK back then and I had understood that such files weren't possible (the lzma tool in LZMA SDK didn't create such files). This modifies the internal API so that LZMA decoder can be told if EOPM is allowed even when the uncompressed size is known. It's allowed with .lzma and not with other uses. Thanks to Karl Beldan for reporting the problem.	2022-07-13 22:24:07 +03:00
jiat75	6468f7e41a	liblzma: Add NULL checks to LZMA and LZMA2 properties encoders. Previously lzma_lzma_props_encode() and lzma_lzma2_props_encode() assumed that the options pointers must be non-NULL because with these filters the API says it must never be NULL. It is good to do these checks anyway.	2022-02-07 00:20:01 +02:00
Lasse Collin	6c6f0db340	liblzma: Fix uninitialized variable. This was introduced two weeks ago in the commit 625f4c7c99b2fcc4db9e7ab2deb4884790e2e17c. Thanks to Nathan Moinvaziri.	2021-01-29 21:19:08 +02:00
Lasse Collin	625f4c7c99	liblzma: Add rough support for output-size-limited encoding in LZMA1. With this it is possible to encode LZMA1 data without EOPM so that the encoder will encode as much input as it can without exceeding the specified output size limit. The resulting LZMA1 stream will be a normal LZMA1 stream without EOPM. The actual uncompressed size will be available to the caller via the uncomp_size pointer. One missing thing is that the LZMA layer doesn't inform the LZ layer when the encoding is finished and thus the LZ may read more input when it won't be used. However, this doesn't matter if encoding is done with a single call (which is the planned use case for now). For proper multi-call encoding this should be improved. This commit only adds the functionality for internal use. Nothing uses it yet.	2021-01-14 18:58:13 +02:00
Lasse Collin	b3ed19a55f	liblzma: Remove unneeded <sys/types.h> from fastpos_tablegen.c. This file only generates fastpos_table.c. It isn't built as a part of liblzma.	2020-02-24 23:23:18 +02:00
Lasse Collin	43dfe04e62	liblzma: Add more uses of lzma_memcmplen() to the normal mode of LZMA. This gives a tiny encoder speed improvement. This could have been done in 2014 after the commit 544aaa3d13554e8640f9caf7db717a96360ec0f6 but it was forgotten.	2020-02-21 17:40:02 +02:00
Lasse Collin	7136f1735c	Rename unaligned_read32ne to read32ne, and similarly for the others.	2019-12-31 00:47:49 +02:00
Lasse Collin	dfac2c9a1d	liblzma: Fix warnings from -Wsign-conversion. Also, more parentheses were added to the literal_subcoder macro in lzma_common.h (better style but no functional change in the current usage).	2019-06-23 21:38:56 +03:00
Lasse Collin	33773c6f2a	liblzma: Use unaligned_readXXne functions instead of type punning. Now gcc -fsanitize=undefined should be clean. Thanks to Jeffrey Walton.	2019-06-01 19:01:21 +03:00
Lasse Collin	94e3f986aa	Fix or hide warnings from GCC 7's -Wimplicit-fallthrough.	2017-08-14 20:08:33 +03:00
Lasse Collin	d4a0462abe	liblzma: Avoid multiple definitions of lzma_coder structures. Only one definition was visible in a translation unit. It avoided a few casts and temp variables but seems that this hack doesn't work with link-time optimizations in compilers as it's not C99/C11 compliant. Fixes: http://www.mail-archive.com/xz-devel@tukaani.org/msg00279.html	2016-11-21 20:24:50 +02:00
Lasse Collin	f4c95ba94b	liblzma: Rename lzma_presets.c back to lzma_encoder_presets.c. It would be too annoying to update other build systems just because of this.	2015-11-03 20:55:45 +02:00
Lasse Collin	4cc584985c	Build: Build LZMA1/2 presets also when only decoder is wanted. People shouldn't rely on the presets when decoding raw streams, but xz uses the presets as the starting point for raw decoder options anyway. lzma_encoder_presets.c was renamed to lzma_presets.c to make it clear it's not used solely by the encoder code.	2015-11-03 18:06:40 +02:00
Lasse Collin	f243f5f44c	liblzma: Silence more uint32_t vs. size_t warnings.	2015-03-07 22:01:00 +02:00
Lasse Collin	117d962685	liblzma: Fix a compression-ratio regression in LZMA1/2 in fast mode. The bug was added in the commit f48fce093b07aeda95c18850f5e086d9f2383380 and thus affected 5.1.4beta and 5.2.0. Luckily the bug cannot cause data corruption or other nasty things.	2015-02-21 23:40:26 +02:00
Lasse Collin	544aaa3d13	liblzma: Use lzma_memcmplen() in normal mode of LZMA. Two locations were not changed yet because the simplest change assumes that the initial "len" may be greater than "limit".	2014-07-25 22:38:28 +03:00
Lasse Collin	f48fce093b	liblzma: Simplify LZMA fast mode code by using memcmp().	2014-07-25 22:30:38 +03:00
Lasse Collin	6bf5308e34	liblzma: Use lzma_memcmplen() in fast mode of LZMA.	2014-07-25 22:29:49 +03:00
Lasse Collin	a19d9e8575	liblzma: Avoid C99 compound literal arrays. MSVC 2013 doesn't like them. Maybe they aren't so good for readability either since many aren't used to them.	2014-01-12 16:44:52 +02:00
Lasse Collin	3778db1be5	liblzma: Make the use of lzma_allocator const-correct. There is a tiny risk of causing breakage: If an application assigns lzma_stream.allocator to a non-const pointer, such code won't compile anymore. I don't know why anyone would do such a thing though, so in practice this shouldn't cause trouble. Thanks to Jan Kratochvil for the patch.	2012-07-17 18:19:59 +03:00
Lasse Collin	1403707fc6	liblzma: Check that the first byte of range encoded data is 0x00. It is just to be more pedantic and thus perhaps catch broken files slightly earlier.	2012-06-28 10:47:49 +03:00
Lasse Collin	3e321a3acd	Remove doubled words from documentation and comments. Spot candidates by running these commands: git ls-files |xargs perl -0777 -n -e 'while (/\b(then?|[iao]n|i[fst]|but|f?or|at|and|[dt]o)\s+\1\b/gims)' -e '{$n=($` =~ tr/\n/\n/ + 1); ($v=$&)=~s/\n/\\n/g; print "$ARGV:$n:$v\n"}' Thanks to Jim Meyering for the original patch.	2011-04-12 11:59:49 +03:00
Lasse Collin	25fe729532	liblzma: Add the forgotten lzma_lzma2_block_size(). This should have been in 5eefc0086d24a65e136352f8c1d19cefb0cbac7a.	2011-04-11 21:15:07 +03:00
Lasse Collin	0d21f49a80	liblzma: Fix decoding of LZMA2 streams having no uncompressed data. The decoder considered empty LZMA2 streams to be corrupt. This shouldn't matter much with .xz files, because no encoder creates empty LZMA2 streams in .xz. This bug is more likely to cause problems in applications that use raw LZMA2 streams.	2011-03-31 11:54:48 +03:00
Lasse Collin	974ebe6349	liblzma: Rename a few variables and constants. This has no semantic changes. I find the new names slightly more logical and they match the names that are already used in XZ Embedded. The name fastpos wasn't changed (not worth the hassle).	2010-10-26 10:36:41 +03:00
Lasse Collin	0076e03641	Clean up a few FIXMEs and TODOs. lzma_chunk_size() was commented out because it is currently useless.	2010-10-19 11:44:37 +03:00
Lasse Collin	075257ab04	Fix the preset -3e. depth=0 was missing.	2010-09-26 18:10:31 +03:00
Lasse Collin	8fd3ac046d	Don't set lc=4 with --extreme. This should reduce the cases where --extreme makes compression worse. On the other hand, some other files may now benefit slightly less from --extreme.	2010-09-04 22:16:28 +03:00
Lasse Collin	b4b1cbcb53	Tweak the compression presets -0 .. -5. "Extreme" mode might need some further tweaking still. Docs were not updated yet.	2010-09-03 15:13:12 +03:00
Lasse Collin	920a69a8d8	Rename MIN() and MAX() to my_min() and my_max(). This should avoid some minor portability issues.	2010-05-26 10:36:46 +03:00
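
Commit 33773c6f2a in the log above replaces type punning with unaligned_readXXne functions (later renamed to read32ne and friends in 7136f1735c) so that gcc -fsanitize=undefined stays clean. The following is only a minimal sketch of that general technique, not the project's actual code: the name sketch_read32ne is hypothetical, and using memcpy() is an assumption about how such a helper is commonly written.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read a 32-bit value in native byte order from a
 * buffer that may not be 4-byte aligned.
 *
 * Casting buf to (const uint32_t *) and dereferencing it would be
 * type punning: it can violate alignment and aliasing rules.
 * Copying the bytes with memcpy() is well defined, and compilers
 * typically turn it into a single load where the target allows it.
 */
static inline uint32_t
sketch_read32ne(const uint8_t *buf)
{
	assert(buf != NULL);

	uint32_t num;
	memcpy(&num, buf, sizeof(num));
	return num;
}
```

A caller would pass any byte pointer, for example sketch_read32ne(in + 1), without needing to know whether in + 1 is aligned.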

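Commits 6bf5308e34, 544aaa3d13 and 43dfe04e62 in the log above switch the LZMA encoder's match-length loops to lzma_memcmplen(). As a rough illustration of what such a helper computes, here is a portable byte-at-a-time sketch. The name sketch_memcmplen, its signature, and the precondition len <= limit are assumptions made for this example; the real function may differ and is presumably faster than a plain byte loop.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: starting from offset len, count how far buf1 and
 * buf2 keep matching, never looking at or past offset limit.
 * Returns the new match length, which is always in [len, limit].
 *
 * The precondition len <= limit is an assumption of this sketch. A call
 * site whose initial len can exceed limit would need an extra check
 * before calling it.
 */
static uint32_t
sketch_memcmplen(const uint8_t *buf1, const uint8_t *buf2,
		uint32_t len, uint32_t limit)
{
	assert(len <= limit);

	while (len < limit && buf1[len] == buf2[len])
		++len;

	return len;
}
```

In a match finder, len would typically start at the number of bytes already known to match, so the helper only compares the remaining bytes.
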
105 Commits