root/xz

mirror of https://git.tukaani.org/xz.git synced 2026-04-24 00:58:05 +00:00

Author	SHA1	Message	Date
Jia Tan	97f9ba50b8	liblzma: Check HAVE_USABLE_CLMUL before omitting CRC32 table. This was split from the prior commit so it could be easily applied to the 5.4 branch. Closes: https://github.com/tukaani-project/xz/pull/77	2024-02-01 20:09:11 +08:00
Jia Tan	ca9015f4de	liblzma: Check HAVE_USABLE_CLMUL before omitting CRC64 table. If liblzma is configured with --disable-clmul-crc CFLAGS="-msse4.1 -mpclmul", then it will fail to compile because the generic version must be used but the CRC tables were not included.	2024-02-01 20:09:11 +08:00
Jia Tan	2f1552a91c	liblzma: Only use ifunc in crcXX_fast.c if it's needed. The code was using HAVE_FUNC_ATTRIBUTE_IFUNC instead of CRC_USE_IFUNC. With ARM64, ifunc is incompatible because it requires non-inline function calls for runtime detection.	2024-02-01 20:09:11 +08:00
Jia Tan	30a25f3742	Docs: Add --disable-arm64-crc32 description to INSTALL.	2024-02-01 20:09:11 +08:00
Jia Tan	1940f0ec28	liblzma: Omit CRC tables when not needed with ARM64 optimizations. This is similar to the existing x86-64 CLMUL conditions to omit the tables. They were slightly refactored to improve readability.	2024-02-01 20:09:11 +08:00
Jia Tan	761f5b69a4	liblzma: Rename crc32_aarch64.h to crc32_arm64.h. Even though the proper name for the architecture is aarch64, this project uses ARM64 throughout. So the rename is for consistency. Additionally, crc32_arm64.h was slightly refactored for the following changes: * Added MSVC, FreeBSD, and macOS support in is_arch_extension_supported(). * crc32_arch_optimized() now checks the size when aligning the buffer. * crc32_arch_optimized() loop conditions were slightly modified to avoid both decrementing the size and incrementing the buffer pointer. * Use the intrinsic wrappers defined in <arm_acle.h> because GCC and Clang name them differently. * Minor spacing and comment changes.	2024-02-01 20:09:11 +08:00
Jia Tan	455a08609c	liblzma: Refactor crc_common.h. The CRC_GENERIC is now split into CRC32_GENERIC and CRC64_GENERIC, since the ARM64 optimizations will be different between CRC32 and CRC64. For the same reason, CRC_ARCH_OPTIMIZED is split into CRC32_ARCH_OPTIMIZED and CRC64_ARCH_OPTIMIZED. ifunc will only be used with x86-64 CLMUL because the runtime detection methods needed with ARM64 are not compatible with ifunc.	2024-02-01 20:09:11 +08:00
Jia Tan	61908e8160	CMake: Add support for ARM64 CRC32 instruction detection.	2024-02-01 20:09:11 +08:00
Jia Tan	c5f6d79cc9	Build: Add support for ARM64 CRC32 instruction detection. This adds --enable-arm64-crc32/--disable-arm64-crc32 (enabled by default) for using the ARM64 CRC32 instruction. This can be disabled if one knows the binary will never need to run on an ARM64 machine with this instruction extension.	2024-02-01 20:09:09 +08:00
Chenxi Mao	849d0f282a	Speed up CRC32 calculation on ARM64. The CRC32 instructions in ARM64 can calculate the CRC32 result for 8 bytes in a single operation, making the use of ARM64 instructions much faster compared to the general CRC32 algorithm. Optimized CRC32 will be enabled when running on Linux on an ARM64 processor that has the CRC extension. Signed-off-by: Chenxi Mao <chenxi.mao2013@gmail.com>	2024-01-27 21:49:26 +08:00
Jia Tan	b43c3e48bf	Bump version number for 5.5.1alpha. v5.5.1alpha	2024-01-26 19:05:51 +08:00
Jia Tan	c7a7ae1500	Add NEWS for 5.5.1alpha	2024-01-26 19:00:52 +08:00
Jia Tan	0ef8192e8d	Add NEWS for 5.4.6.	2024-01-26 18:54:24 +08:00
Lasse Collin	93de7e751d	Move doc/logo/xz-logo.png to "doc" and Doxygen footer to "doxygen". The footer isn't a complete HTML file so having it in the doxygen directory is a tiny bit clearer.	2024-01-24 20:00:57 +02:00
Jia Tan	00fa01698d	README: Add COPYING.CC-BY-SA-4.0 entry to section 1.1. The Overall documentation section (1.1) table spacing had to be adjusted since the filename was very long.	2024-01-25 01:39:35 +08:00
Jia Tan	e280470040	Build: Add the logo and license to the release.	2024-01-25 01:39:35 +08:00
Jia Tan	b1ee6cf259	COPYING: Add the license for the XZ logo.	2024-01-25 01:39:29 +08:00
Jia Tan	31293ae707	Doxygen: Added the XZ logo and copyright information. The PROJECT_LOGO field is now used to include the XZ logo. The footer of each page now lists the copyright information instead of the default footer. The license is also copied to satisfy the copyright and so the link in the documentation can be local.	2024-01-25 01:06:01 +08:00
Lasse Collin	6daa4d0ea4	xz: Use threaded mode by default (as if --threads=0 was used). This hopefully does more good than bad: + It's faster by default. + Only the threaded compressor creates files that can be decompressed in threaded mode. - Compression ratio is worse, usually not too much though. When it matters, -T1 must be used. - Memory usage increases. - Scripts that assume single-threaded mode but don't use -T1 will possibly use too many resources, for example, if they run multiple xz processes in parallel to compress multiple files. - Output from single-threaded and multi-threaded compressors differ but such changes could happen for other reasons too (they just haven't happened since 5.0.0).	2024-01-23 18:29:28 +02:00
Jia Tan	a2dd2dc8e5	CI: Use RISC-V filter when building with BCJ support.	2024-01-23 23:55:44 +08:00
Jia Tan	3060e1070b	Tests: Use smaller dictionary size in RISC-V test files.	2024-01-23 23:55:44 +08:00
Jia Tan	44ff2fa5c9	Tests: Skip RISC-V test files if decoder was not built.	2024-01-23 23:55:39 +08:00
Lasse Collin	6133a3f300	xz: Man page: Add more examples of LZMA2 options with BCJ filters.	2024-01-23 23:05:47 +08:00
Lasse Collin	50255feeaa	liblzma: RISC-V filter: Use byte-by-byte access. Not all RISC-V processors support fast unaligned access so it's better to read only one byte in the main loop. This can be faster even on x86-64 when compared to reading 32 bits at a time as half the time the address is only 16-bit aligned. The downside is larger code size on archs that do support fast unaligned access.	2024-01-23 23:05:47 +08:00
Jia Tan	db5eb5f563	xz: Update xz -lvv for RISC-V filter. Version 5.6.0 will be shown, even though upcoming alphas and betas will be able to support this filter. 5.6.0 looks nicer in the output and people shouldn't be encouraged to use an unstable version in production in any way.	2024-01-23 23:05:47 +08:00
Jia Tan	e2870db5be	Tests: Add two RISC-V Filter test files. These test files achieve 100% code coverage in src/liblzma/simple/riscv.c. They contain all of the instructions that should be filtered and a few cases that should not.	2024-01-23 23:05:47 +08:00
Jia Tan	b26a898693	xz: Update message in --long-help for RISC-V Filter.	2024-01-23 23:05:47 +08:00
Jia Tan	283f778908	xz: Update the man page for the RISC-V Filter. A special note was added to suggest using four-byte alignment when the compressed instruction extension is not present in a RISC-V binary.	2024-01-23 23:05:47 +08:00
Jia Tan	ac3691ccca	Tests: Add RISC-V Filter test in test_compress.sh.	2024-01-23 23:05:47 +08:00
Jia Tan	2959dbc735	liblzma: Update string_conversion.c to support RISC-V Filter.	2024-01-23 23:05:47 +08:00
Jia Tan	34372a5adb	CMake: Support RISC-V BCJ Filter for encoding and decoding.	2024-01-23 23:05:47 +08:00
Jia Tan	440a2eccb0	liblzma: Add RISC-V BCJ filter. The new Filter ID is 0x0B. Thanks to Chien Wong <m@xv97.com> for the initial version of the Filter, the xz CLI updates, and the Autotools build system modifications. Thanks to Igor Pavlov for his many contributions to the design of the filter.	2024-01-23 23:05:41 +08:00
Jia Tan	5540f4329b	Docs: Update .xz file format specification to 1.2.0. The new RISC-V filter was added to the specification, in addition to updating the specification URL.	2024-01-19 23:08:14 +08:00
Jia Tan	22d86192f8	xz: Update website URLs in the man pages.	2024-01-19 23:08:14 +08:00
Jia Tan	6b63c4c613	liblzma: Update website URL.	2024-01-19 23:08:14 +08:00
Jia Tan	fce4758018	Docs: Update website URLs.	2024-01-19 23:08:14 +08:00
Jia Tan	c26812c5b2	Build: Update website URL.	2024-01-19 23:08:14 +08:00
Lasse Collin	fbb3ce541e	liblzma: CRC: Add a comment to crc_x86_clmul.h about BUILDING_ macros.	2024-01-11 15:25:00 +02:00
Lasse Collin	4f518c1b6b	liblzma: CRC: Remove crc_always_inline, use lzma_always_inline instead. Now that crc_simd_body() in crc_x86_clmul.h is only called once in a translation unit, we no longer need to be so cautious about ensuring the always-inline behavior.	2024-01-11 15:24:35 +02:00
Lasse Collin	35c03ec6bf	liblzma: CRC: Update CLMUL comments to more generic wording.	2024-01-11 14:39:46 +02:00
Lasse Collin	66f080e801	liblzma: Rename arch-specific CRC functions and macros. CRC_CLMUL was split to CRC_ARCH_OPTIMIZED and CRC_X86_CLMUL. CRC_ARCH_OPTIMIZED is defined when an arch-optimized version is used. Currently the x86 CLMUL implementations are the only arch-optimized versions, and these also use the CRC_X86_CLMUL macro to tell when crc_x86_clmul.h needs to be included. is_clmul_supported() was renamed to is_arch_extension_supported(). crc32_clmul() and crc64_clmul() were renamed to crc32_arch_optimized() and crc64_arch_optimized(). This way the names make sense with arch-specific non-CLMUL implementations as well.	2024-01-11 14:29:42 +02:00
Lasse Collin	3dbed75b0b	liblzma: Fix a comment in crc_common.h.	2024-01-11 14:29:42 +02:00
Lasse Collin	419f55f9df	liblzma: Avoid extern lzma_crc32_clmul() and lzma_crc64_clmul(). A CLMUL-only build will have the crcxx_clmul() inlined into lzma_crcxx(). Previously a jump to the extern lzma_crcxx_clmul() was needed. Notes about shared liblzma on ELF platforms: - On platforms that support ifunc and -fvisibility=hidden, this was silly because a CLMUL-only build would have that single extra jump instruction of extra overhead. - On platforms that support neither -fvisibility=hidden nor linker version script (liblzma*.map), jumping to lzma_crcxx_clmul() would go via PLT so a few more instructions of overhead (still not a big issue but silly nevertheless). There was a downside with static liblzma too: if an application only needs lzma_crc64(), static linking would make the linker include the CLMUL code for both CRC32 and CRC64 from crc_x86_clmul.o even though the CRC32 code wouldn't be needed, thus increasing code size of the executable (assuming that -ffunction-sections isn't used). Also, now compilers are likely to inline crc_simd_body() even if they don't support the always_inline attribute (or MSVC's __forceinline). Quite possibly all compilers that build the code do support such an attribute. But now it likely isn't a problem even if the attribute wasn't supported. Now all x86-specific stuff is in crc_x86_clmul.h. Other archs can then have their own headers with their own is_clmul_supported() and crcxx_clmul(). Another bonus is that the build system doesn't need to care if crc_clmul.c is needed. is_clmul_supported() stays as an inline function as it's not needed when doing a CLMUL-only build (avoids a warning about unused function).	2024-01-11 14:29:42 +02:00
Lasse Collin	e3833e297d	liblzma: crc_clmul.c: Add crc_attr_target macro. This reduces the number of the complex #if directives.	2024-01-11 14:29:42 +02:00
Lasse Collin	d164ac0e62	liblzma: Simplify existing cases with lzma_attr_no_sanitize_address.	2024-01-11 14:29:42 +02:00
Lasse Collin	9523c1300d	liblzma: #define crc_attr_no_sanitize_address in crc_common.h.	2024-01-11 14:29:38 +02:00
Lasse Collin	93d144f093	liblzma: CRC: Add empty lines. And remove one too.	2024-01-10 17:19:03 +02:00
Lasse Collin	0c7e854ffd	liblzma: crc_clmul.c: Tidy up the location of MSVC pragma. It makes no difference in practice.	2024-01-10 17:19:03 +02:00
Lasse Collin	15cf3f04f2	Update THANKS.	2023-12-28 17:17:39 +02:00
Lasse Collin	cd64dd70d5	liblzma: Use 8-byte method in memcmplen.h on ARM64. It requires fast unaligned access to 64-bit integers and a fast instruction to count trailing zeros in a 64-bit integer (__builtin_ctzll()). This perhaps should be enabled on some other archs too. Thanks to Chenxi Mao for the original patch: https://github.com/tukaani-project/xz/pull/75 (the first commit) According to the numbers there, this may improve encoding speed by about 3-5 %. This enables the 8-byte method on MSVC ARM64 too which should work but wasn't tested.	2023-12-28 17:17:39 +02:00
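
The 8-byte method mentioned in commit cd64dd70d5 can be illustrated roughly as follows. This is only a sketch of the general technique, not the code in memcmplen.h: the function name `match_len_sketch` and its parameters are hypothetical, and it assumes GCC or Clang (for `__builtin_ctzll()`, `__builtin_clzll()`, and the `__BYTE_ORDER__` macros). It reads 8 bytes from each buffer, XORs them, and counts zero bits to find the first byte that differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper for illustration; not liblzma's actual function.
// Returns how many bytes at the start of a and b are equal, beginning the
// comparison at offset len and never looking past offset limit.
static size_t
match_len_sketch(const uint8_t *a, const uint8_t *b,
		size_t len, size_t limit)
{
	assert(len <= limit);

	// Compare 8 bytes at a time while at least 8 bytes remain.
	while (limit - len >= 8) {
		uint64_t x;
		uint64_t y;

		// memcpy() expresses an unaligned load in portable C.
		memcpy(&x, a + len, sizeof(x));
		memcpy(&y, b + len, sizeof(y));

		const uint64_t diff = x ^ y;
		if (diff != 0) {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
			// Big endian: the first differing byte holds the
			// highest set bit.
			return len + ((size_t)__builtin_clzll(diff) >> 3);
#else
			// Little endian: the first differing byte holds the
			// lowest set bit, so count trailing zeros.
			return len + ((size_t)__builtin_ctzll(diff) >> 3);
#endif
		}

		len += 8;
	}

	// Finish the remaining 0-7 bytes one at a time.
	while (len < limit && a[len] == b[len])
		++len;

	return len;
}
```

The speedup depends on the unaligned 64-bit loads and the bit-counting instruction both being cheap on the target, which is the condition the commit message states for ARM64.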

2166 Commits
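
Several of the CRC commits listed above contrast arch-optimized code (x86-64 CLMUL, the ARM64 CRC32 instruction) with a generic fallback. As a point of reference, a minimal bit-at-a-time CRC32 could look like the sketch below. It is not liblzma's implementation, which these commits describe as table-based; the name `crc32_generic_sketch` is hypothetical, and the choice to pass the previous CRC as the third argument so that calls can be chained is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical name for illustration. Computes the common CRC32
// (reflected polynomial 0xEDB88320) one bit at a time. Pass 0 as crc
// for the first call, or the previous result to continue a checksum.
static uint32_t
crc32_generic_sketch(const uint8_t *buf, size_t size, uint32_t crc)
{
	assert(buf != NULL || size == 0);

	crc = ~crc;

	for (size_t i = 0; i < size; ++i) {
		crc ^= buf[i];

		// Process the eight bits of this byte, lowest bit first.
		for (int bit = 0; bit < 8; ++bit) {
			// mask is all ones if the low bit is set, else zero.
			const uint32_t mask = 0U - (crc & 1U);
			crc = (crc >> 1) ^ (UINT32_C(0xEDB88320) & mask);
		}
	}

	return ~crc;
}
```

A loop like this handles one bit per iteration, whereas commit 849d0f282a notes that the ARM64 instruction handles 8 bytes in a single operation; that gap is why the optimized paths are worth the extra build-system and runtime-detection work.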