root/xz - xz - Root on GIT

Author	SHA1	Message	Date
Lasse Collin	f3872a5947	liblzma: Optimize LZ decoder slightly. Now extra buffer space is reserved so that repeating bytes for any single match will never need to copy from two places (both the beginning and the end of the buffer). This simplifies dict_repeat() and helps a little with speed. This seems to reduce .lzma decompression time about 2 %, so with .xz and CRC it could be slightly less. The small things add up still.	2024-02-14 18:31:16 +02:00
Lasse Collin	eb518446e5	liblzma: LZMA decoder: Get rid of next_state[]. It's not completely obvious if this is better in the decoder. It should be good if compiler can avoid creating a branch (like using CMOV on x86). This also makes lzma_encoder.c use the new macros.	2024-02-14 18:31:16 +02:00
Lasse Collin	e0c0ee475c	liblzma: LZMA decoder improvements. This adds macros for bittree decoding which prepares the code for alternative C versions and inline assembly.	2024-02-14 18:31:16 +02:00
Jia Tan	de5c5e4176	liblzma: Creates Non-resumable and Resumable modes for lzma_decoder. The new decoder resumes the first decoder loop in the Resumable mode. Then, the code executes in Non-resumable mode until it detects that it cannot guarantee to have enough input/output to decode another symbol. The Resumable mode is how the decoder has always worked. Before decoding every input bit, it checks if there is enough space and will save its location to be resumed later. When the decoder has more input/output, it jumps back to the correct sequence in the Resumable mode code. When the input/output buffers are large, the Resumable mode is much slower than the Non-resumable because it has more branches and is harder for the compiler to optimize since it is in a large switch block. Early benchmarking shows significant time improvement (8-10% on gcc and clang x86) by using the Non-resumable code as much as possible.	2024-02-14 18:31:16 +02:00
Jia Tan	e446ab7a18	liblzma: Creates separate "safe" range decoder mode. The new "safe" range decoder mode is the same as old range decoder, but now the default behavior of the range decoder will not check if there is enough input or output to complete the operation. When the buffers are close to fully consumed, the "safe" operations must be used instead. This will improve speed because it will reduce the number of branches needed for most of the range decoder operations.	2024-02-14 18:31:16 +02:00
Lasse Collin	22af94128b	Add SPDX license identifier into 0BSD source code files.	2024-02-14 18:31:16 +02:00
Lasse Collin	689e0228ba	Change most public domain parts to 0BSD. Translations and doc/xz-file-format.txt and doc/lzma-file-format.txt were not touched. COPYING.0BSD was added.	2024-02-14 18:31:12 +02:00
Lasse Collin	33b8a24b66	liblzma: Add LZMA_FILTER_LZMA1EXT to support LZMA1 without end marker. Some file formats need support for LZMA1 streams that don't use the end of payload marker (EOPM) alias end of stream (EOS) marker. So far liblzma API has supported decompressing such streams via lzma_alone_decoder() when .lzma header specifies a known uncompressed size. Encoding support hasn't been available in the API. Instead of adding a new LZMA1-only API for this purpose, this commit adds a new filter ID for use with raw encoder and decoder. The main benefit of this approach is that then also filter chains are possible, for example, if someone wants to implement support for .7z files that use the x86 BCJ filter with LZMA1 (not BCJ2 as that isn't supported in liblzma).	2022-11-27 23:16:21 +02:00
Lasse Collin	9a304bf1e4	liblzma: Avoid unneeded use of void pointer in LZMA decoder.	2022-11-27 18:43:07 +02:00
Lasse Collin	218394958c	liblzma: Pass the Filter ID to LZ encoder and decoder. This allows using two Filter IDs with the same initialization function and data structures.	2022-11-27 18:20:33 +02:00
Lasse Collin	107c93ee5c	liblzma: Rename a variable and improve a comment.	2022-07-14 18:12:38 +03:00
Lasse Collin	9595a3119b	liblzma: Add optional autodetection of LZMA end marker. Turns out that this is needed for .lzma files as the spec in LZMA SDK says that end marker may be present even if the size is stored in the header. Such files are rare but exist in the real world. The code in liblzma is so old that the spec didn't exist in LZMA SDK back then and I had understood that such files weren't possible (the lzma tool in LZMA SDK didn't create such files). This modifies the internal API so that LZMA decoder can be told if EOPM is allowed even when the uncompressed size is known. It's allowed with .lzma and not with other uses. Thanks to Karl Beldan for reporting the problem.	2022-07-13 22:24:07 +03:00
Lasse Collin	7136f1735c	Rename unaligned_read32ne to read32ne, and similarly for the others.	2019-12-31 00:47:49 +02:00
Lasse Collin	dfac2c9a1d	liblzma: Fix warnings from -Wsign-conversion. Also, more parentheses were added to the literal_subcoder macro in lzma_common.h (better style but no functional change in the current usage).	2019-06-23 21:38:56 +03:00
Lasse Collin	94e3f986aa	Fix or hide warnings from GCC 7's -Wimplicit-fallthrough.	2017-08-14 20:08:33 +03:00
Lasse Collin	d4a0462abe	liblzma: Avoid multiple definitions of lzma_coder structures. Only one definition was visible in a translation unit. It avoided a few casts and temp variables but seems that this hack doesn't work with link-time optimizations in compilers as it's not C99/C11 compliant. Fixes: http://www.mail-archive.com/xz-devel@tukaani.org/msg00279.html	2016-11-21 20:24:50 +02:00
Lasse Collin	3778db1be5	liblzma: Make the use of lzma_allocator const-correct. There is a tiny risk of causing breakage: If an application assigns lzma_stream.allocator to a non-const pointer, such code won't compile anymore. I don't know why anyone would do such a thing though, so in practice this shouldn't cause trouble. Thanks to Jan Kratochvil for the patch.	2012-07-17 18:19:59 +03:00
Lasse Collin	1403707fc6	liblzma: Check that the first byte of range encoded data is 0x00. It is just to be more pedantic and thus perhaps catch broken files slightly earlier.	2012-06-28 10:47:49 +03:00
Lasse Collin	974ebe6349	liblzma: Rename a few variables and constants. This has no semantic changes. I find the new names slightly more logical and they match the names that are already used in XZ Embedded. The name fastpos wasn't changed (not worth the hassle).	2010-10-26 10:36:41 +03:00
Lasse Collin	0076e03641	Clean up a few FIXMEs and TODOs. lzma_chunk_size() was commented out because it is currently useless.	2010-10-19 11:44:37 +03:00
Lasse Collin	eb7d51a3fa	Collection of language fixes to comments and docs. Thanks to Jonathan Nieder.	2010-02-12 13:16:15 +02:00
Lasse Collin	ebfb2c5e1f	Use a tuklib module for integer handling. This replaces bswap.h and integer.h. The tuklib module uses <byteswap.h> on GNU, <sys/endian.h> on *BSDs and <sys/byteorder.h> on Solaris, which may contain optimized code like inline assembly.	2009-10-04 22:57:12 +03:00
Lasse Collin	02ddf09bc3	Put the interesting parts of XZ Utils into the public domain. Some minor documentation cleanups were made at the same time.	2009-04-13 11:27:40 +03:00
Lasse Collin	f76e39cf93	Added initial support for preset dictionary for raw LZMA1 and LZMA2. It is not supported by the .xz format or the xz command line tool yet.	2009-01-27 18:36:05 +02:00
Lasse Collin	c596fda40b	Make the memusage functions of LZMA1 and LZMA2 decoders validate the filter options.	2008-12-01 22:58:22 +02:00
Lasse Collin	1dcecfb09b	Some API changes, bug fixes, cleanups etc.	2008-09-27 19:09:21 +03:00
Lasse Collin	13a74b78e3	Renamed constants: - LZMA_VLI_VALUE_MAX -> LZMA_VLI_MAX - LZMA_VLI_VALUE_UNKNOWN -> LZMA_VLI_UNKNOWN - LZMA_HEADER_ERROR -> LZMA_OPTIONS_ERROR	2008-09-13 12:10:43 +03:00
Lasse Collin	3b34851de1	Sort of garbage collection commit. :-| Many things are still broken. API has changed a lot and it will still change a little more here and there. The command line tool doesn't have all the required changes to reflect the API changes, so it's easy to get "internal error" or trigger assertions.	2008-08-28 22:53:15 +03:00
Lasse Collin	0809c46534	Add limit of lc + lp <= 4. Now we can allocate the literal coder as part of the main LZMA encoder or decoder structure. Make the LZMA decoder rely on the current internal API to free the allocated memory in case an error occurs.	2008-06-19 16:35:08 +03:00
Lasse Collin	7d17818cec	Update the code to mostly match the new simpler file format specification. Simplify things by removing most of the support for known uncompressed size in most places. There are some miscellaneous changes here and there too. The API of liblzma has got many changes and still some more will be done soon. While most of the code has been updated, some things are not fixed (the command line tool will choke with invalid filter chain, if nothing else). Subblock filter is somewhat broken for now. It will be updated once the encoded format of the Subblock filter has been decided.	2008-06-18 18:02:10 +03:00
Lasse Collin	7521bbdc83	Update a comment to use the variable name rep_len_decoder. (And BTW, the previous commit actually did change the program logic slightly.)	2008-03-22 01:26:36 +02:00
Lasse Collin	63b74d000e	Demystified the "state" variable in LZMA code. Use the word literal instead of char for better consistency. There are still some names with _char instead of _literal in lzma_optimum, these may be changed later. Renamed length coder variables. This commit doesn't change the program logic.	2008-03-22 00:57:33 +02:00
Lasse Collin	bfde3b24a5	Apply a minor speed optimization to LZMA decoder.	2008-03-11 15:35:34 +02:00
Lasse Collin	c0e19e0662	Remove two redundant validity checks from the LZMA decoder. These are already checked elsewhere, so omitting these gives (very) tiny speed up.	2008-02-28 10:24:31 +02:00
Lasse Collin	3599dba957	More fixes to LZMA decoder's flush marker handling.	2008-01-14 11:54:56 +02:00
Lasse Collin	d160ee3259	Another bug fix for flush marker detection.	2008-01-05 01:20:24 +02:00
Lasse Collin	fc67f79f60	Fix stupid bugs in flush marker detection.	2008-01-04 21:37:01 +02:00
Lasse Collin	0029cbbabe	Added support for flush marker, which will be in files that use LZMA_SYNC_FLUSH with encoder (not implemented yet). This is a new feature in the raw LZMA format, which isn't supported by old decoders. This shouldn't be a problem in practice, since lzma_alone_encoder() will not allow LZMA_SYNC_FLUSH, and thus not allow creating files undecodable with old decoders. Made lzma_decoder.c require a tab width of 4 characters if one wants to fit the code in 80 columns. This makes the code easier to read.	2008-01-04 21:30:33 +02:00
Lasse Collin	bbfd1f6ab0	Moved range decoder initialization (reading the first five input bytes) from LZMA decoder to range decoder header. Did the same for decoding of direct bits.	2008-01-04 20:45:05 +02:00
Lasse Collin	5d018dc035	Imported to git.	2007-12-09 00:42:33 +02:00

40 Commits
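
The message of commit eb518446e5 weighs a lookup table (next_state[]) against code that a compiler may turn into branch-free instructions. The sketch below illustrates that trade-off for the state update after a literal. It is not code from liblzma: the names literal_next_table, literal_next_arith, and check_all_states are hypothetical, and the table values are an assumption based on the usual 12-state LZMA state machine.

```c
#include <assert.h>

/* Number of states in the LZMA state machine (assumed: 12). */
#define STATES 12

/* Table-driven variant: the state that follows a literal.
 * The values are written out for illustration only. */
static const unsigned next_state_table[STATES] = {
	0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5
};

static unsigned
literal_next_table(unsigned state)
{
	return next_state_table[state];
}

/* Arithmetic variant: two comparisons and a subtraction.
 * A compiler may or may not emit conditional moves for this;
 * that depends on the target and the optimization level. */
static unsigned
literal_next_arith(unsigned state)
{
	return state < 4 ? 0 : (state < 10 ? state - 3 : state - 6);
}

/* Verify that both variants agree for every valid state. */
static void
check_all_states(void)
{
	for (unsigned s = 0; s < STATES; ++s)
		assert(literal_next_table(s) == literal_next_arith(s));
}
```

Calling check_all_states() from a small main() confirms that the two forms are interchangeable under the assumed table. Which one is faster in a real decoder would have to be measured, as the commit message itself notes.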