root/xz

mirror of https://git.tukaani.org/xz.git synced 2026-04-01 05:38:01 +00:00

Author	SHA1	Message	Date
Lasse Collin	cd64dd70d5	liblzma: Use 8-byte method in memcmplen.h on ARM64. It requires fast unaligned access to 64-bit integers and a fast instruction to count trailing zeros in a 64-bit integer (__builtin_ctzll()). This perhaps should be enabled on some other archs too. Thanks to Chenxi Mao for the original patch: https://github.com/tukaani-project/xz/pull/75 (the first commit) According to the numbers there, this may improve encoding speed by about 3-5 %. This enables the 8-byte method on MSVC ARM64 too which should work but wasn't tested.	2023-12-28 17:17:39 +02:00
Lasse Collin	12c90c00f0	liblzma: Check also for __clang__ in memcmplen.h. This change hopefully makes no practical difference as Clang likely was detected via __GNUC__ or _MSC_VER already.	2023-12-28 17:17:39 +02:00
Jia Tan	b34b6a9912	liblzma: Initialize lzma_lz_encoder pointers with NULL. This fixes the recent change to lzma_lz_encoder that used memzero instead of the NULL constant. On some compilers the NULL constant (always 0) may not equal the NULL pointer (which is only guaranteed not to point to a valid memory address). Later code compares the pointers to the NULL pointer so we must initialize them with the NULL pointer instead of 0 to guarantee code correctness.	2023-12-20 21:38:39 +08:00
Jia Tan	183a62f0b5	liblzma: Set all values in lzma_lz_encoder to NULL after allocation. The first member of lzma_lz_encoder doesn't necessarily need to be set to NULL since it will always be set before anything tries to use it. However the function pointer members must be set to NULL since other functions rely on this NULL value to determine if this behavior is supported or not. This fixes a somewhat serious bug, where the options_update() and set_out_limit() function pointers are not set to NULL. This seems to have been forgotten since these function pointers were added many years after the original two (code() and end()). The problem is that by not setting this to NULL we are relying on the memory allocation to zero things out if lzma_filters_update() is called on a LZMA1 encoder. The function pointer for set_out_limit() is less serious because there is not an API function that could call this in an incorrect way. set_out_limit() is only called by the MicroLZMA encoder, which must use LZMA1 where set_out_limit() is always set. It's currently not possible to call set_out_limit() on an LZMA2 encoder. So calling lzma_filters_update() on an LZMA1 encoder had undefined behavior since it's possible that memory could be manipulated so the options_update member pointed to a different instruction sequence. This is unlikely to be a bug in an existing application since it relies on calling lzma_filters_update() on an LZMA1 encoder in the first place. For instance, it does not affect xz because lzma_filters_update() can only be used when encoding to the .xz format. This is fixed by using memzero() to set all members of lzma_lz_encoder to NULL after it is allocated. This ensures this mistake will not occur here in the future if any additional function pointers are added.	2023-12-16 20:51:38 +08:00
Jia Tan	1a1bb381db	liblzma: Tweak a comment.	2023-12-16 20:30:55 +08:00
Jia Tan	55810780e0	liblzma: Make parameter names in function definition match declaration. lzma_raw_encoder() and lzma_raw_encoder_init() used "options" as the parameter name instead of "filters" (used by the declaration). "filters" is more clear since the parameter represents the list of filters passed to the raw encoder, each of which contains filter options.	2023-12-16 20:28:21 +08:00
Jia Tan	5dad6f628a	liblzma: Improve lzma encoder init function consistency. lzma_encoder_init() did not check for NULL options, but lzma2_encoder_init() did. This is more of a code style improvement than anything else to help make lzma_encoder_init() and lzma2_encoder_init() more similar.	2023-12-16 20:18:47 +08:00
Jia Tan	2ade7246e7	liblzma: Add missing comments to lz_encoder.h.	2023-11-09 01:21:53 +08:00
Lasse Collin	46007049cd	liblzma: Fix compilation of fastpos_tablegen.c. The macro lzma_attr_visibility_hidden has to be defined to make fastpos.h usable. The visibility attribute is irrelevant to fastpos_tablegen.c so simply #define the macro to an empty value. fastpos_tablegen.c is never built by the included build systems and so the problem wasn't noticed earlier. It's just a standalone program for generating fastpos_table.c. Fixes: https://github.com/tukaani-project/xz/pull/69 Thanks to GitHub user Jamaika1.	2023-10-31 21:41:09 +02:00
Lasse Collin	8c36ab79cb	liblzma: Add a note why crc_always_inline exists for now. Solaris Studio is a possible example (not tested) which supports the always_inline attribute but might not get detected by the common.h #ifdefs.	2023-10-30 18:44:32 +02:00
Lasse Collin	e7a86b94cd	liblzma: Use lzma_always_inline in memcmplen.h.	2023-10-30 18:44:32 +02:00
Lasse Collin	dcfe563299	liblzma: #define lzma_always_inline in common.h.	2023-10-30 18:44:32 +02:00
Lasse Collin	41113fe30a	liblzma: Use lzma_attr_visibility_hidden on private extern declarations. These variables are internal to liblzma and not exposed in the API.	2023-10-30 18:06:25 +02:00
Lasse Collin	a2f5ca706a	liblzma: #define lzma_attr_visibility_hidden in common.h. In ELF shared libs: -fvisibility=hidden affects definitions of symbols but not declarations.[*] This doesn't affect direct calls to functions inside liblzma as a linker can replace a call to lzma_foo@plt with a call directly to lzma_foo when -fvisibility=hidden is used. [*] It has to be like this because otherwise every installed header file would need to explicitly set the symbol visibility to default. When accessing extern variables that aren't defined in the same translation unit, the compiler assumes that the variable has the default visibility and thus indirection is needed. Unlike function calls, the linker cannot optimize this. Using __attribute__((__visibility__("hidden"))) with the extern variable declarations tells the compiler that indirection isn't needed because the definition is in the same shared library. About 15+ years ago, someone told me that it would be good if the CRC tables would be defined in the same translation unit as the C code of the CRC functions. While I understood that it could help a tiny amount, I didn't want to change the code because a separate translation unit for the CRC tables was needed for the x86 assembly code anyway. But when visibility attributes are supported, simply marking the extern declaration with the hidden attribute will get an identical result. When there are only a few affected variables, this is trivial to do. I wish I had understood this back then already.	2023-10-30 18:03:39 +02:00
Lasse Collin	2c7ee92e44	liblzma: Refer to MinGW-w64 instead of MinGW in the API headers. MinGW (formerly a MinGW.org Project, later the MinGW.OSDN Project at <https://osdn.net/projects/mingw/>) has GCC 9.2.0 as the most recent GCC package (released 2021-02-02). The project might still be alive but the majority of people have switched to MinGW-w64. Thus it seems clearer to refer to MinGW-w64 in our API headers too. Building with MinGW is likely to still work but I haven't tested it in recent years.	2023-10-26 21:46:06 +03:00
Lasse Collin	a7d1b2825c	liblzma: Add Cflags.private to liblzma.pc.in for MSYS2. It properly adds -DLZMA_API_STATIC when compiling code that will be linked against static liblzma. Having it there on systems other than Windows does no harm. See: https://www.msys2.org/docs/pkgconfig/	2023-10-26 21:46:06 +03:00
Jia Tan	988e09f27b	liblzma: Move is_clmul_supported() back to crc_common.h. This partially reverts creating crc_clmul.c (8c0f9376f58c0696d5d6719705164d35542dd891) where is_clmul_supported() was moved, extern'ed, and renamed to lzma_is_clmul_supported(). This caused a problem when the function call to lzma_is_clmul_supported() results in a call through the PLT. ifunc resolvers run very early in the dynamic loading sequence, so the PLT may not be set up properly at this point. Whether the PLT is used or not for lzma_is_clmul_supported() depended upon the compiler toolchain and flags used. In liblzma compiled with GCC, for instance, GCC will go through the PLT for function calls internal to liblzma if the version scripts and symbol visibility hiding are not used. If lazy-binding is disabled, then it would have made any program linked with liblzma fail during dynamic loading in the ifunc resolver.	2023-10-21 00:01:29 +08:00
Jia Tan	105c7ca90d	Build: Remove check for COND_CHECK_CRC32 in check/Makefile.inc. Currently crc32 is always enabled, so COND_CHECK_CRC32 must always be set. Because of this, it makes the recent change to conditionally compile check/crc_clmul.c appear wrong since that file has CLMUL implementations for both CRC32 and CRC64.	2023-10-19 16:23:32 +08:00
Jia Tan	c60b25569d	liblzma: Fix -fsanitize=address failure with crc_clmul functions. After forcing crc_simd_body() to always be inlined it caused -fsanitize=address to fail for lzma_crc32_clmul() and lzma_crc64_clmul(). The __no_sanitize_address__ attribute was added to lzma_crc32_clmul() and lzma_crc64_clmul(), but not removed from crc_simd_body(). ASAN and inline functions behavior has changed over the years for GCC specifically, so while not strictly required we will keep __attribute__((__no_sanitize_address__)) on crc_simd_body() in case this becomes a requirement in the future. Older GCC versions refuse to inline a function with ASAN if the caller and callee do not agree on sanitization flags (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89124#c3). If the function was forced to be inlined, it will not compile if the callee function has __no_sanitize_address__ but the caller doesn't.	2023-10-19 01:15:20 +08:00
Jia Tan	1c8884f0af	liblzma: Set the MSVC optimization fix to only cover lzma_crc64_clmul(). After testing a 32-bit Release build on MSVC, only lzma_crc64_clmul() has the bug. crc_simd_body() and lzma_crc32_clmul() do not need the optimizations disabled.	2023-10-18 23:54:41 +08:00
Lasse Collin	5ce0f7a48b	liblzma: CRC_USE_GENERIC_FOR_SMALL_INPUTS cannot be used with ifunc.	2023-10-18 23:54:41 +08:00
Lasse Collin	2773538049	liblzma: Include common.h in crc_common.h. crc_common.h depends on common.h. The headers include common.h except when there is a reason to not do so.	2023-10-18 23:54:41 +08:00
Jia Tan	e13b7947b9	liblzma: Add include guards to crc_common.h.	2023-10-18 23:54:41 +08:00
Jia Tan	40abd88afc	liblzma: Add the crc_always_inline macro to crc_simd_body(). Forcing this to be inline has a significant speed improvement at the cost of a few repeated instructions. The compilers tested on did not inline this function since it is large and is used twice in the same translation unit.	2023-10-18 23:54:41 +08:00
Jia Tan	a5966c276b	liblzma: Create crc_always_inline macro. This macro must be used instead of the inline keyword. On MSVC, it is a replacement for __forceinline which is an MSVC specific keyword that should not be used with inline (it will issue a warning if it is). It does not use a build system check to determine if __attribute__((__always_inline__)) is supported since all compilers that can use CLMUL extensions (except the special case for MSVC) should support this attribute. If this assumption is incorrect then it will result in a bug report instead of silently producing slow code.	2023-10-18 23:54:41 +08:00
Jia Tan	96b663f67c	liblzma: Refactor CRC comments. A detailed description of the three dispatch methods was added. Also, duplicated comments now only appear in crc32_fast.c or were removed from both crc32_fast.c and crc64_fast.c if they appeared in crc_clmul.c.	2023-10-18 23:54:41 +08:00
Jia Tan	8c0f9376f5	liblzma: Create crc_clmul.c. Both crc32_clmul() and crc64_clmul() are now exported from crc_clmul.c as lzma_crc32_clmul() and lzma_crc64_clmul(). This ensures that is_clmul_supported() (now lzma_is_clmul_supported()) is not duplicated between crc32_fast.c and crc64_fast.c. Also, it encapsulates the complexity of the CLMUL implementations into a single file and reduces the complexity of crc32_fast.c and crc64_fast.c. Before, CLMUL code was present in crc32_fast.c, crc64_fast.c, and crc_common.h. During the conversion, various cleanups were applied to code (thanks to Lasse Collin) including: - Require using semicolons with MASK_/L/H/LH macros. - Variable typing and const handling improvements. - Improvements to comments. - Fixes to the pragmas used. - Removed unneeded variables. - Whitespace improvements. - Fixed CRC_USE_GENERIC_FOR_SMALL_INPUTS handling. - Silenced warnings and removed the need for some #pragmas	2023-10-18 23:54:36 +08:00
Jia Tan	a3ebc2c516	liblzma: Define CRC_USE_IFUNC in crc_common.h. When ifunc is supported, we can define a simpler macro instead of repeating the more complex check in both crc32_fast.c and crc64_fast.c.	2023-10-18 20:41:11 +08:00
Hans Jansen	f1cd9d7194	liblzma: Added crc32_clmul to crc32_fast.c.	2023-10-13 20:54:05 +08:00
Hans Jansen	93e6fb08b2	liblzma: Moved CLMUL CRC logic to crc_common.h. crc64_fast.c was updated to use the code from crc_common.h instead.	2023-10-13 20:54:05 +08:00
Hans Jansen	233885a437	liblzma: Rename crc_macros.h to crc_common.h.	2023-10-13 20:54:05 +08:00
Lasse Collin	5a9af95f85	liblzma: Update a comment. The C standards don't allow an empty translation unit which can be avoided by declaring something, without exporting any symbols. When I committed f644473a211394447824ea00518d0a214ff3f7f2 I had a feeling that some specific toolchain somewhere didn't like empty object files (assembler or maybe "ar" complained) but I cannot find anything to confirm this now. Quite likely I remembered nonsense. I leave this here as a note to my future self. :-)	2023-09-26 21:47:13 +03:00
Jia Tan	8ebaf3f665	liblzma: Avoid compiler warning without creating extra symbol. When the generic fast crc64 method is used, then we omit lzma_crc64_table[][]. Similar to d9166b52cf3458a4da3eb92224837ca8fc208d79, we can avoid compiler warnings with -Wempty-translation-unit (Clang) or -pedantic (GCC) by creating a never used typedef instead of an extra symbol.	2023-09-27 00:04:40 +08:00
Jia Tan	f6667702bf	liblzma: Change quoting style from `...' to '...'. This was done for both internal and API headers.	2023-09-24 22:09:47 +08:00
Lasse Collin	ee7709bae5	liblzma: Move a few __attribute__ uses in function declarations. The API headers have many attributes but these were left as is for now.	2023-09-22 20:06:27 +03:00
Lasse Collin	18a66fbac0	Remove incorrect uses of __attribute__((__malloc__)). xrealloc() is obviously incorrect, modern GCC docs even mention realloc() as an example where this attribute cannot be used. liblzma's lzma_alloc() and lzma_alloc_zero() would be correct uses most of the time but custom allocators may use a memory pool or otherwise hold the pointer so aliasing issues could happen in theory. The xstrdup() case likely was correct but I removed it anyway. Now there are no __malloc__ attributes left in the code. The allocations aren't in hot paths so this should make no practical difference.	2023-09-22 20:06:27 +03:00
Lasse Collin	4f44ef8675	liblzma: Mark crc64_clmul() with __attribute__((__no_sanitize_address__)). Thanks to Agostino Sarubbo. Fixes: https://github.com/tukaani-project/xz/issues/62	2023-09-14 16:34:07 +03:00
Jia Tan	721e3d9f7a	liblzma: Update assert in vli_ceil4(). The argument to vli_ceil4() should always guarantee the return value is also a valid lzma_vli. Thus the highest three valid lzma_vli values are invalid arguments. All uses of the function ensure this so the assert is updated to match this.	2023-08-28 23:05:34 +08:00
Jia Tan	ae5c07b22a	liblzma: Add overflow check for Unpadded size in lzma_index_append(). This was not a security bug since there was no path to overflow UINT64_MAX in lzma_index_append() or when it calls index_file_size(). The bug was discovered by a failing assert() in vli_ceil4() when called from index_file_size() when unpadded_sum (the sum of the compressed size of current Stream and the unpadded_size parameter) exceeds LZMA_VLI_MAX. Previously, the unpadded_size parameter was checked to be not greater than UNPADDED_SIZE_MAX, but no check was done once compressed_base was added. This could not have caused an integer overflow in index_file_size() when called by lzma_index_append(). The calculation for file_size breaks down into the sum of: - Compressed base from all previous Streams - 2 * LZMA_STREAM_HEADER_SIZE (size of the current Stream's header and footer) - stream_padding (can be set by lzma_index_stream_padding()) - Compressed base from the current Stream - Unpadded size (parameter to lzma_index_append()) The sum of everything except for Unpadded size must be less than LZMA_VLI_MAX. This is guaranteed by overflow checks in the functions that can set these values including lzma_index_stream_padding(), lzma_index_append(), and lzma_index_cat(). The maximum value for Unpadded size is enforced by lzma_index_append() to be less than or equal to UNPADDED_SIZE_MAX. Thus, the sum cannot exceed UINT64_MAX since LZMA_VLI_MAX is half of UINT64_MAX. Thanks to Joona Kannisto for reporting this.	2023-08-28 23:04:56 +08:00
Dimitri Papadopoulos Orfanos	42df7c7aa1	Docs: Fix typos found by codespell	2023-07-31 20:02:21 +08:00
Jia Tan	d9166b52cf	liblzma: Prevent an empty translation unit in Windows builds. To work around Automake lacking Windows resource compiler support, an empty source file is compiled to overwrite the resource files for static library builds. Translation units without an external declaration are not allowed by the C standard and result in a warning when used with -Wempty-translation-unit (Clang) or -pedantic (GCC).	2023-07-24 23:11:13 +08:00
Jia Tan	0184d344fa	liblzma: Suppress -Wunused-function warning. Clang 16.0.0 and earlier have a bug that the ifunc resolver function triggers the -Wunused-function warning. The resolver function is static and only "used" by the __attribute__((__ifunc__())). At this time, the bug is still unresolved, but has been reported: https://github.com/llvm/llvm-project/issues/63957 This is not a problem in GCC.	2023-07-19 23:36:00 +08:00
Jia Tan	43845fa70f	liblzma: Reword lzma_str_list_filters() documentation. This further improves the documentation from commit f36ca7982f6bd5e9827219ed4f3c5a1fbf5d7bdf. The previous wording of "supported options" was slightly misleading since the options that are printed are the ones that are relevant for encoding/decoding. It is not about which options can or must be specified.	2023-07-18 22:57:58 +08:00
Jia Tan	818701ba1c	liblzma: Improve comment in string_conversion.c. The comment used "flag" when referring to decoder options. Just referring to them as options is more clear and consistent.	2023-07-18 22:56:47 +08:00
Lasse Collin	97fd5cb669	liblzma: Tweak #if condition in memcmplen.h. Maybe ICC always #defines _MSC_VER on Windows but now it's very clear which code will get used.	2023-07-18 13:57:54 +03:00
Lasse Collin	40392c19f7	liblzma: Omit unnecessary parentheses in a preprocessor directive.	2023-07-18 13:49:43 +03:00
Jia Tan	17f8844e6f	liblzma: Remove non-portable empty initializer. Commit 78704f36e74205857c898a351c757719a6c8b666 added an empty initializer {} to prevent a warning. The empty initializer is a GNU extension and results in a build failure on MSVC. The -Wpedantic flag warns about empty initializers.	2023-07-08 21:24:19 +08:00
Jia Tan	78704f36e7	liblzma: Prevent uninitialized warning in mt stream encoder. This change only impacts the compiler warning since it was impossible for the wait_abs struct in stream_encode_mt() to be used before it was initialized since mythread_condtime_set() will always be called before mythread_cond_timedwait(). Since the mythread.h code is different between the POSIX and Windows versions, this warning was only present on Windows builds. Thanks to Arthur S for reporting the warning and providing an initial patch.	2023-06-29 00:06:16 +08:00
Jia Tan	e3356a204c	liblzma: Prevent warning for MSYS2 Windows build. In lzma_memcmplen(), the <intrin.h> header file is only included if _MSC_VER and _M_X64 are both defined but _BitScanForward64() was previously used if _M_X64 was defined. GCC for MSYS2 defines _M_X64 but not _MSC_VER so _BitScanForward64() was used without including <intrin.h>. Now, lzma_memcmplen() will use __builtin_ctzll() for MSYS2 GCC builds as expected.	2023-06-28 23:59:51 +08:00
Lasse Collin	ee44863ae8	liblzma: Add ifunc implementation to crc64_fast.c. The ifunc method avoids indirection via the function pointer crc64_func. This works on GNU/Linux and probably on FreeBSD too. The previous __attribute__((__constructor__)) method is kept for compatibility with ELF platforms which do not support ifunc. The ifunc method has some limitations, for example, building liblzma with -fsanitize=address will result in segfaults. The configure option --disable-ifunc must be used for such builds. Thanks to Hans Jansen for the original patch. Closes: https://github.com/tukaani-project/xz/pull/53	2023-06-27 23:55:59 +08:00
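
Several of the memcmplen.h commits listed above refer to an "8-byte method" built on unaligned 64-bit loads and __builtin_ctzll(). The following is only a rough sketch of that general idea, not the code in liblzma: the function name match_len_8byte() and its parameters are hypothetical, and it assumes a little-endian target and a GCC- or Clang-compatible compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not liblzma's lzma_memcmplen()): starting at
 * offset len, count how many bytes of a and b match, up to limit.
 * Assumptions: little-endian byte order, and __builtin_ctzll() is
 * available (GCC or Clang).
 */
static inline size_t
match_len_8byte(const uint8_t *a, const uint8_t *b, size_t len, size_t limit)
{
	assert(len <= limit);

	/* Compare eight bytes per iteration while a full word still fits. */
	while (limit - len >= 8) {
		uint64_t x;
		uint64_t y;

		/* memcpy() expresses an unaligned load in portable C. */
		memcpy(&x, a + len, sizeof(x));
		memcpy(&y, b + len, sizeof(y));

		const uint64_t diff = x ^ y;
		if (diff != 0) {
			/*
			 * On little endian the first differing byte holds
			 * the lowest set bit of diff, so the trailing zero
			 * count divided by eight is its byte index.
			 */
			return len + ((size_t)__builtin_ctzll(diff) >> 3);
		}

		len += 8;
	}

	/* Fewer than eight bytes remain: finish one byte at a time. */
	while (len < limit && a[len] == b[len])
		++len;

	return len;
}
```

For two 26-byte buffers that first differ at index 11, match_len_8byte(p, q, 0, 26) returns 11. The byte-wise tail loop keeps the sketch from reading past limit; a real implementation may instead rely on extra readable bytes after the buffer, which this sketch does not assume.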

659 Commits