root/xz - xz - Root on GIT

mirror of https://git.tukaani.org/xz.git synced 2026-04-25 01:28:00 +00:00

Author	SHA1	Message	Date
Lasse Collin	eb518446e5	liblzma: LZMA decoder: Get rid of next_state[]. It's not completely obvious if this is better in the decoder. It should be good if compiler can avoid creating a branch (like using CMOV on x86). This also makes lzma_encoder.c use the new macros.	2024-02-14 18:31:16 +02:00
Lasse Collin	e0c0ee475c	liblzma: LZMA decoder improvements. This adds macros for bittree decoding which prepares the code for alternative C versions and inline assembly.	2024-02-14 18:31:16 +02:00
Jia Tan	de5c5e4176	liblzma: Creates Non-resumable and Resumable modes for lzma_decoder. The new decoder resumes the first decoder loop in the Resumable mode. Then, the code executes in Non-resumable mode until it detects that it cannot guarantee to have enough input/output to decode another symbol. The Resumable mode is how the decoder has always worked. Before decoding every input bit, it checks if there is enough space and will save its location to be resumed later. When the decoder has more input/output, it jumps back to the correct sequence in the Resumable mode code. When the input/output buffers are large, the Resumable mode is much slower than the Non-resumable because it has more branches and is harder for the compiler to optimize since it is in a large switch block. Early benchmarking shows significant time improvement (8-10% on gcc and clang x86) by using the Non-resumable code as much as possible.	2024-02-14 18:31:16 +02:00
Jia Tan	e446ab7a18	liblzma: Creates separate "safe" range decoder mode. The new "safe" range decoder mode is the same as old range decoder, but now the default behavior of the range decoder will not check if there is enough input or output to complete the operation. When the buffers are close to fully consumed, the "safe" operations must be used instead. This will improve speed because it will reduce the number of branches needed for most of the range decoder operations.	2024-02-14 18:31:16 +02:00
Lasse Collin	b941549573	liblzma: Include the SPDX license identifier 0BSD to generated files. Perhaps the generated files aren't even copyrightable but using the same license for them as for the rest of the liblzma keeps things more consistent for tools that look for license info.	2024-02-14 18:31:16 +02:00
Lasse Collin	8e4ec79483	liblzma: Fix compilation of price_tablegen.c. It is built and run only manually so this didn't matter unless one wanted to regenerate the price_table.c.	2024-02-14 18:31:16 +02:00
Lasse Collin	22af94128b	Add SPDX license identifier into 0BSD source code files.	2024-02-14 18:31:16 +02:00
Lasse Collin	23de53421e	liblzma: Sync the AUTHORS fix about SHA-256 to lzma.h.	2024-02-14 18:31:16 +02:00
Lasse Collin	689e0228ba	Change most public domain parts to 0BSD. Translations and doc/xz-file-format.txt and doc/lzma-file-format.txt were not touched. COPYING.0BSD was added.	2024-02-14 18:31:12 +02:00
Lasse Collin	76946dc433	Fix SHA-256 authors. The initial commit 5d018dc03549c1ee4958364712fb0c94e1bf2741 in 2007 had a comment in sha256.c that the code is based on Crypto++ Library 5.5.1. In 2009 the Authors list in sha256.c and the AUTHORS file was updated with information that the code had come from Crypto++ but via 7-Zip. I know I had viewed 7-Zip's SHA-256 code but back then the C code was identical enough to Crypto++'s, so I don't know why I thought the author info would need that extra step via 7-Zip for this single file. Another error is that I had mixed up the sha.* and shacal2.* files when checking for author info in Crypto++. The shacal2.* files aren't related to liblzma's sha256.c and thus Kevin Springle's code in Crypto++ isn't either.	2024-02-14 15:23:00 +02:00
Jia Tan	45663443eb	liblzma: Fix build error if only RISC-V BCJ filter is enabled. If any other BCJ filter was enabled for encoding or decoding, then this was not a problem.	2024-02-13 23:33:21 +08:00
Jia Tan	adb073da76	liblzma: Fix typo discovered by codespell.	2024-02-09 23:59:54 +08:00
Jia Tan	7f68a68c19	liblzma: Update Authors list in crc32_arm64.h.	2024-02-02 01:38:51 +08:00
Jia Tan	97f9ba50b8	liblzma: Check HAVE_USABLE_CLMUL before omitting CRC32 table. This was split from the prior commit so it could be easily applied to the 5.4 branch. Closes: https://github.com/tukaani-project/xz/pull/77	2024-02-01 20:09:11 +08:00
Jia Tan	ca9015f4de	liblzma: Check HAVE_USABLE_CLMUL before omitting CRC64 table. If liblzma is configured with --disable-clmul-crc CFLAGS="-msse4.1 -mpclmul", then it will fail to compile because the generic version must be used but the CRC tables were not included.	2024-02-01 20:09:11 +08:00
Jia Tan	2f1552a91c	liblzma: Only use ifunc in crcXX_fast.c if it's needed. The code was using HAVE_FUNC_ATTRIBUTE_IFUNC instead of CRC_USE_IFUNC. With ARM64, ifunc is incompatible because it requires non-inline function calls for runtime detection.	2024-02-01 20:09:11 +08:00
Jia Tan	1940f0ec28	liblzma: Omit CRC tables when not needed with ARM64 optimizations. This is similar to the existing x86-64 CLMUL conditions to omit the tables. They were slightly refactored to improve readability.	2024-02-01 20:09:11 +08:00
Jia Tan	761f5b69a4	liblzma: Rename crc32_aarch64.h to crc32_arm64.h. Even though the proper name for the architecture is aarch64, this project uses ARM64 throughout. So the rename is for consistency. Additionally, crc32_arm64.h was slightly refactored for the following changes: * Added MSVC, FreeBSD, and macOS support in is_arch_extension_supported(). * crc32_arch_optimized() now checks the size when aligning the buffer. * crc32_arch_optimized() loop conditions were slightly modified to avoid both decrementing the size and incrementing the buffer pointer. * Use the intrinsic wrappers defined in <arm_acle.h> because GCC and Clang name them differently. * Minor spacing and comment changes.	2024-02-01 20:09:11 +08:00
Jia Tan	455a08609c	liblzma: Refactor crc_common.h. The CRC_GENERIC is now split into CRC32_GENERIC and CRC64_GENERIC, since the ARM64 optimizations will be different between CRC32 and CRC64. For the same reason, CRC_ARCH_OPTIMIZED is split into CRC32_ARCH_OPTIMIZED and CRC64_ARCH_OPTIMIZED. ifunc will only be used with x86-64 CLMUL because the runtime detection methods needed with ARM64 are not compatible with ifunc.	2024-02-01 20:09:11 +08:00
Chenxi Mao	849d0f282a	Speed up CRC32 calculation on ARM64. The CRC32 instructions in ARM64 can calculate the CRC32 result for 8 bytes in a single operation, making the use of ARM64 instructions much faster compared to the general CRC32 algorithm. Optimized CRC32 will be enabled if ARM64 has CRC extension running on Linux. Signed-off-by: Chenxi Mao <chenxi.mao2013@gmail.com>	2024-01-27 21:49:26 +08:00
Jia Tan	b43c3e48bf	Bump version number for 5.5.1alpha.	2024-01-26 19:05:51 +08:00
Lasse Collin	50255feeaa	liblzma: RISC-V filter: Use byte-by-byte access. Not all RISC-V processors support fast unaligned access so it's better to read only one byte in the main loop. This can be faster even on x86-64 when compared to reading 32 bits at a time as half the time the address is only 16-bit aligned. The downside is larger code size on archs that do support fast unaligned access.	2024-01-23 23:05:47 +08:00
Jia Tan	2959dbc735	liblzma: Update string_conversion.c to support RISC-V Filter.	2024-01-23 23:05:47 +08:00
Jia Tan	440a2eccb0	liblzma: Add RISC-V BCJ filter. The new Filter ID is 0x0B. Thanks to Chien Wong <m@xv97.com> for the initial version of the Filter, the xz CLI updates, and the Autotools build system modifications. Thanks to Igor Pavlov for his many contributions to the design of the filter.	2024-01-23 23:05:41 +08:00
Jia Tan	6b63c4c613	liblzma: Update website URL.	2024-01-19 23:08:14 +08:00
Lasse Collin	fbb3ce541e	liblzma: CRC: Add a comment to crc_x86_clmul.h about BUILDING_ macros.	2024-01-11 15:25:00 +02:00
Lasse Collin	4f518c1b6b	liblzma: CRC: Remove crc_always_inline, use lzma_always_inline instead. Now crc_simd_body() in crc_x86_clmul.h is only called once in a translation unit, we no longer need to be so cautious about ensuring the always-inline behavior.	2024-01-11 15:24:35 +02:00
Lasse Collin	35c03ec6bf	liblzma: CRC: Update CLMUL comments to more generic wording.	2024-01-11 14:39:46 +02:00
Lasse Collin	66f080e801	liblzma: Rename arch-specific CRC functions and macros. CRC_CLMUL was split to CRC_ARCH_OPTIMIZED and CRC_X86_CLMUL. CRC_ARCH_OPTIMIZED is defined when an arch-optimized version is used. Currently the x86 CLMUL implementations are the only arch-optimized versions, and these also use the CRC_x86_CLMUL macro to tell when crc_x86_clmul.h needs to be included. is_clmul_supported() was renamed to is_arch_extension_supported(). crc32_clmul() and crc64_clmul() were renamed to crc32_arch_optimized() and crc64_arch_optimized(). This way the names make sense with arch-specific non-CLMUL implementations as well.	2024-01-11 14:29:42 +02:00
Lasse Collin	3dbed75b0b	liblzma: Fix a comment in crc_common.h.	2024-01-11 14:29:42 +02:00
Lasse Collin	419f55f9df	liblzma: Avoid extern lzma_crc32_clmul() and lzma_crc64_clmul(). A CLMUL-only build will have the crcxx_clmul() inlined into lzma_crcxx(). Previously a jump to the extern lzma_crcxx_clmul() was needed. Notes about shared liblzma on ELF platforms: - On platforms that support ifunc and -fvisibility=hidden, this was silly because CLMUL-only build would have that single extra jump instruction of extra overhead. - On platforms that support neither -fvisibility=hidden nor linker version script (liblzma*.map), jumping to lzma_crcxx_clmul() would go via PLT so a few more instructions of overhead (still not a big issue but silly nevertheless). There was a downside with static liblzma too: if an application only needs lzma_crc64(), static linking would make the linker include the CLMUL code for both CRC32 and CRC64 from crc_x86_clmul.o even though the CRC32 code wouldn't be needed, thus increasing code size of the executable (assuming that -ffunction-sections isn't used). Also, now compilers are likely to inline crc_simd_body() even if they don't support the always_inline attribute (or MSVC's __forceinline). Quite possibly all compilers that build the code do support such an attribute. But now it likely isn't a problem even if the attribute wasn't supported. Now all x86-specific stuff is in crc_x86_clmul.h. Other archs can then have their own headers with their own is_clmul_supported() and crcxx_clmul(). Another bonus is that the build system doesn't need to care if crc_clmul.c is needed. is_clmul_supported() stays as inline function as it's not needed when doing a CLMUL-only build (avoids a warning about unused function).	2024-01-11 14:29:42 +02:00
Lasse Collin	e3833e297d	liblzma: crc_clmul.c: Add crc_attr_target macro. This reduces the number of the complex #if directives.	2024-01-11 14:29:42 +02:00
Lasse Collin	d164ac0e62	liblzma: Simplify existing cases with lzma_attr_no_sanitize_address.	2024-01-11 14:29:42 +02:00
Lasse Collin	9523c1300d	liblzma: #define crc_attr_no_sanitize_address in crc_common.h.	2024-01-11 14:29:38 +02:00
Lasse Collin	93d144f093	liblzma: CRC: Add empty lines. And remove one too.	2024-01-10 17:19:03 +02:00
Lasse Collin	0c7e854ffd	liblzma: crc_clmul.c: Tidy up the location of MSVC pragma. It makes no difference in practice.	2024-01-10 17:19:03 +02:00
Lasse Collin	cd64dd70d5	liblzma: Use 8-byte method in memcmplen.h on ARM64. It requires fast unaligned access to 64-bit integers and a fast instruction to count trailing zeros in a 64-bit integer (__builtin_ctzll()). This perhaps should be enabled on some other archs too. Thanks to Chenxi Mao for the original patch: https://github.com/tukaani-project/xz/pull/75 (the first commit) According to the numbers there, this may improve encoding speed by about 3-5 %. This enables the 8-byte method on MSVC ARM64 too which should work but wasn't tested.	2023-12-28 17:17:39 +02:00
Lasse Collin	12c90c00f0	liblzma: Check also for __clang__ in memcmplen.h. This change hopefully makes no practical difference as Clang likely was detected via __GNUC__ or _MSC_VER already.	2023-12-28 17:17:39 +02:00
Jia Tan	b34b6a9912	liblzma: Initialize lzma_lz_encoder pointers with NULL. This fixes the recent change to lzma_lz_encoder that used memzero instead of the NULL constant. On some compilers the NULL constant (always 0) may not equal the NULL pointer (this only needs to guarantee that it does not point to a valid memory address). Later code compares the pointers to the NULL pointer so we must initialize them with the NULL pointer instead of 0 to guarantee code correctness.	2023-12-20 21:38:39 +08:00
Jia Tan	183a62f0b5	liblzma: Set all values in lzma_lz_encoder to NULL after allocation. The first member of lzma_lz_encoder doesn't necessarily need to be set to NULL since it will always be set before anything tries to use it. However the function pointer members must be set to NULL since other functions rely on this NULL value to determine if this behavior is supported or not. This fixes a somewhat serious bug, where the options_update() and set_out_limit() function pointers are not set to NULL. This seems to have been forgotten since these function pointers were added many years after the original two (code() and end()). The problem is that by not setting this to NULL we are relying on the memory allocation to zero things out if lzma_filters_update() is called on a LZMA1 encoder. The function pointer for set_out_limit() is less serious because there is not an API function that could call this in an incorrect way. set_out_limit() is only called by the MicroLZMA encoder, which must use LZMA1 where set_out_limit() is always set. It's currently not possible to call set_out_limit() on an LZMA2 encoder. So calling lzma_filters_update() on an LZMA1 encoder had undefined behavior since it's possible that memory could be manipulated so the options_update member pointed to a different instruction sequence. This is unlikely to be a bug in an existing application since it relies on calling lzma_filters_update() on an LZMA1 encoder in the first place. For instance, it does not affect xz because lzma_filters_update() can only be used when encoding to the .xz format. This is fixed by using memzero() to set all members of lzma_lz_encoder to NULL after it is allocated. This ensures this mistake will not occur here in the future if any additional function pointers are added.	2023-12-16 20:51:38 +08:00
Jia Tan	1a1bb381db	liblzma: Tweak a comment.	2023-12-16 20:30:55 +08:00
Jia Tan	55810780e0	liblzma: Make parameter names in function definition match declaration. lzma_raw_encoder() and lzma_raw_encoder_init() used "options" as the parameter name instead of "filters" (used by the declaration). "filters" is more clear since the parameter represents the list of filters passed to the raw encoder, each of which contains filter options.	2023-12-16 20:28:21 +08:00
Jia Tan	5dad6f628a	liblzma: Improve lzma encoder init function consistency. lzma_encoder_init() did not check for NULL options, but lzma2_encoder_init() did. This is more of a code style improvement than anything else to help make lzma_encoder_init() and lzma2_encoder_init() more similar.	2023-12-16 20:18:47 +08:00
Jia Tan	2ade7246e7	liblzma: Add missing comments to lz_encoder.h.	2023-11-09 01:21:53 +08:00
Lasse Collin	46007049cd	liblzma: Fix compilation of fastpos_tablegen.c. The macro lzma_attr_visibility_hidden has to be defined to make fastpos.h usable. The visibility attribute is irrelevant to fastpos_tablegen.c so simply #define the macro to an empty value. fastpos_tablegen.c is never built by the included build systems and so the problem wasn't noticed earlier. It's just a standalone program for generating fastpos_table.c. Fixes: https://github.com/tukaani-project/xz/pull/69 Thanks to GitHub user Jamaika1.	2023-10-31 21:41:09 +02:00
Lasse Collin	8c36ab79cb	liblzma: Add a note why crc_always_inline exists for now. Solaris Studio is a possible example (not tested) which supports the always_inline attribute but might not get detected by the common.h #ifdefs.	2023-10-30 18:44:32 +02:00
Lasse Collin	e7a86b94cd	liblzma: Use lzma_always_inline in memcmplen.h.	2023-10-30 18:44:32 +02:00
Lasse Collin	dcfe563299	liblzma: #define lzma_always_inline in common.h.	2023-10-30 18:44:32 +02:00
Lasse Collin	41113fe30a	liblzma: Use lzma_attr_visibility_hidden on private extern declarations. These variables are internal to liblzma and not exposed in the API.	2023-10-30 18:06:25 +02:00
Lasse Collin	a2f5ca706a	liblzma: #define lzma_attr_visibility_hidden in common.h. In ELF shared libs: -fvisibility=hidden affects definitions of symbols but not declarations.[*] This doesn't affect direct calls to functions inside liblzma as a linker can replace a call to lzma_foo@plt with a call directly to lzma_foo when -fvisibility=hidden is used. [*] It has to be like this because otherwise every installed header file would need to explicitly set the symbol visibility to default. When accessing extern variables that aren't defined in the same translation unit, compiler assumes that the variable has the default visibility and thus indirection is needed. Unlike function calls, linker cannot optimize this. Using __attribute__((__visibility__("hidden"))) with the extern variable declarations tells the compiler that indirection isn't needed because the definition is in the same shared library. About 15+ years ago, someone told me that it would be good if the CRC tables would be defined in the same translation unit as the C code of the CRC functions. While I understood that it could help a tiny amount, I didn't want to change the code because a separate translation unit for the CRC tables was needed for the x86 assembly code anyway. But when visibility attributes are supported, simply marking the extern declaration with the hidden attribute will get identical result. When there are only a few affected variables, this is trivial to do. I wish I had understood this back then already.	2023-10-30 18:03:39 +02:00
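
Commit cd64dd70d5 above refers to an "8-byte method" for measuring how long two buffers match. The following is a minimal sketch of that general idea, not the code in memcmplen.h. The function name match_len_8byte() is hypothetical, and the sketch assumes a little-endian CPU and a GCC- or Clang-compatible compiler that provides __builtin_ctzll().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not liblzma's actual implementation.
 * Returns how many leading bytes of a and b are equal, looking at
 * no more than limit bytes. Assumes little-endian byte order.
 */
static size_t
match_len_8byte(const uint8_t *a, const uint8_t *b, size_t limit)
{
	assert(a != NULL && b != NULL);

	size_t len = 0;

	/* Compare 8 bytes per iteration while a full 64-bit word remains. */
	while (limit - len >= 8) {
		uint64_t x;
		uint64_t y;

		/* memcpy() is the portable way to do an unaligned load. */
		memcpy(&x, a + len, sizeof(x));
		memcpy(&y, b + len, sizeof(y));

		const uint64_t diff = x ^ y;
		if (diff != 0) {
			/*
			 * On little-endian, the lowest set bit of diff lies
			 * in the first differing byte. Dividing the bit
			 * index by 8 gives that byte's offset.
			 */
			return len + ((size_t)__builtin_ctzll(diff) >> 3);
		}

		len += 8;
	}

	/* Fewer than 8 bytes remain: finish one byte at a time. */
	while (len < limit && a[len] == b[len])
		++len;

	return len;
}
```

For two 16-byte buffers that first differ at offset 11, match_len_8byte(a, b, 16) returns 11. The first 8-byte word compares equal and the XOR of the second word locates the differing byte at offset 3 within it.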

695 Commits