root/xz

mirror of https://git.tukaani.org/xz.git synced 2026-04-30 03:58:04 +00:00

Author	SHA1	Message	Date
Lasse Collin	22af94128b	Add SPDX license identifier into 0BSD source code files.	2024-02-14 18:31:16 +02:00
Lasse Collin	689e0228ba	Change most public domain parts to 0BSD. Translations and doc/xz-file-format.txt and doc/lzma-file-format.txt were not touched. COPYING.0BSD was added.	2024-02-14 18:31:12 +02:00
Lasse Collin	50255feeaa	liblzma: RISC-V filter: Use byte-by-byte access. Not all RISC-V processors support fast unaligned access so it's better to read only one byte in the main loop. This can be faster even on x86-64 when compared to reading 32 bits at a time as half the time the address is only 16-bit aligned. The downside is larger code size on archs that do support fast unaligned access.	2024-01-23 23:05:47 +08:00
Jia Tan	440a2eccb0	liblzma: Add RISC-V BCJ filter. The new Filter ID is 0x0B. Thanks to Chien Wong <m@xv97.com> for the initial version of the Filter, the xz CLI updates, and the Autotools build system modifications. Thanks to Igor Pavlov for his many contributions to the design of the filter.	2024-01-23 23:05:41 +08:00
Lasse Collin	30e95bb44c	liblzma: Avoid null pointer + 0 (undefined behavior in C). In the C99 and C17 standards, section 6.5.6 paragraph 8 means that adding 0 to a null pointer is undefined behavior. As of writing, "clang -fsanitize=undefined" (Clang 15) diagnoses this. However, I'm not aware of any compiler that would take advantage of this when optimizing (Clang 15 included). It's good to avoid this anyway since compilers might some day infer that pointer arithmetic implies that the pointer is not NULL. That is, the following foo() would then unconditionally return 0, even for foo(NULL, 0): void bar(char a, char b); int foo(char *a, size_t n) { bar(a, a + n); return a == NULL; } In contrast to C, C++ explicitly allows null pointer + 0. So if the above is compiled as C++ then there is no undefined behavior in the foo(NULL, 0) call. To me it seems that changing the C standard would be the sane thing to do (just add one sentence) as it would ensure that a huge amount of old code won't break in the future. Based on web searches it seems that a large number of codebases (where null pointer + 0 occurs) are being fixed instead to be future-proof in case compilers will some day optimize based on it (like making the above foo(NULL, 0) return 0) which in the worst case will cause security bugs. Some projects don't plan to change it. For example, gnulib and thus many GNU tools currently require that null pointer + 0 is defined: https://lists.gnu.org/archive/html/bug-gnulib/2021-11/msg00000.html https://www.gnu.org/software/gnulib/manual/html_node/Other-portability-assumptions.html In XZ Utils null pointer + 0 issue should be fixed after this commit. This adds a few if-statements and thus branches to avoid null pointer + 0. These check for size > 0 instead of ptr != NULL because this way bugs where size > 0 && ptr == NULL will likely get caught quickly. None of them are in hot spots so it shouldn't matter for performance. 
A little less readable version would be replacing ptr + offset with offset != 0 ? ptr + offset : ptr or creating a macro for it: #define my_ptr_add(ptr, offset) \ ((offset) != 0 ? ((ptr) + (offset)) : (ptr)) Checking for offset != 0 instead of ptr != NULL allows GCC >= 8.1, Clang >= 7, and Clang-based ICX to optimize it to the very same code as ptr + offset. That is, it won't create a branch. So for hot code this could be a good solution to avoid null pointer + 0. Unfortunately other compilers like ICC 2021 or MSVC 19.33 (VS2022) will create a branch from my_ptr_add(). Thanks to Marcin Kowalczyk for reporting the problem: https://github.com/tukaani-project/xz/issues/36	2023-02-23 20:41:22 +02:00
Lasse Collin	8fd225a2c1	liblzma: Update authors list in arm64.c.	2022-12-16 18:30:02 +02:00
Lasse Collin	f9ca7d4516	liblzma: Omit zero-skipping from ARM64 filter. It has some complicated downsides and its usefulness is more limited than I originally thought. So this change is bad for certain very specific situations but a generic solution that works for other filters (and is otherwise better too) is planned anyway. And this way 7-Zip can use the same compatible filter for the .7z format. This is still marked as experimental with a new temporary Filter ID.	2022-12-01 18:55:00 +02:00
Lasse Collin	90caaded2d	liblzma: Omit simple coder init functions if they are disabled.	2022-11-25 18:04:37 +02:00
Lasse Collin	b56bc8251d	Revert "liblzma: Simple/BCJ filters: Allow disabling generic BCJ options." This reverts commit 177bdc922cb17bd0fd831ab8139dfae912a5c2b8 and also does equivalent change to arm64.c. Now that ARM64 filter will use lzma_options_bcj, this change is not needed anymore.	2022-11-14 23:19:57 +02:00
Lasse Collin	8370ec8edf	Replace the experimental ARM64 filter with a new experimental version. This is incompatible with the previous version. This has space/tab fixes in filter_*.c and bcj.h too.	2022-11-14 23:16:38 +02:00
Lasse Collin	f664cb2584	liblzma: ARM64: Add comments.	2022-09-20 16:58:22 +03:00
Lasse Collin	ecb966de30	liblzma: Add experimental ARM64 BCJ filter with a temporary Filter ID. That is, the Filter ID will be changed once the design is final. The current version will be removed. So files created with the temporary Filter ID won't be supported in the future.	2022-09-19 20:23:46 +03:00
Lasse Collin	177bdc922c	liblzma: Simple/BCJ filters: Allow disabling generic BCJ options. This will be needed for the ARM64 BCJ filter as it will use its own options struct.	2022-09-17 22:42:18 +03:00
Lasse Collin	7136f1735c	Rename unaligned_read32ne to read32ne, and similarly for the others.	2019-12-31 00:47:49 +02:00
Lasse Collin	dfac2c9a1d	liblzma: Fix warnings from -Wsign-conversion. Also, more parentheses were added to the literal_subcoder macro in lzma_common.h (better style but no functional change in the current usage).	2019-06-23 21:38:56 +03:00
Lasse Collin	2a22de439e	liblzma: Avoid memcpy(NULL, foo, 0) because it is undefined behavior. I should have always known this but I didn't. Here is an example as a reminder to myself: int mycopy(void *dest, void *src, size_t n) { memcpy(dest, src, n); return dest == NULL; } In the example, a compiler may assume that dest != NULL because passing NULL to memcpy() would be undefined behavior. Testing with GCC 8.2.1, mycopy(NULL, NULL, 0) returns 1 with -O0 and -O1. With -O2 the return value is 0 because the compiler infers that dest cannot be NULL because it was already used with memcpy() and thus the test for NULL gets optimized out. In liblzma, if a null-pointer was passed to memcpy(), there were no checks for NULL after the memcpy() call, so I cautiously suspect that it shouldn't have caused bad behavior in practice, but it's hard to be sure, and the problematic cases had to be fixed anyway. Thanks to Jeffrey Walton.	2019-05-13 20:05:17 +03:00
Lasse Collin	d4a0462abe	liblzma: Avoid multiple definitions of lzma_coder structures. Only one definition was visible in a translation unit. It avoided a few casts and temp variables but seems that this hack doesn't work with link-time optimizations in compilers as it's not C99/C11 compliant. Fixes: http://www.mail-archive.com/xz-devel@tukaani.org/msg00279.html	2016-11-21 20:24:50 +02:00
Lasse Collin	3778db1be5	liblzma: Make the use of lzma_allocator const-correct. There is a tiny risk of causing breakage: If an application assigns lzma_stream.allocator to a non-const pointer, such code won't compile anymore. I don't know why anyone would do such a thing though, so in practice this shouldn't cause trouble. Thanks to Jan Kratochvil for the patch.	2012-07-17 18:19:59 +03:00
Lasse Collin	d8db706acb	liblzma: Fix possibility of incorrect LZMA_BUF_ERROR. lzma_code() could incorrectly return LZMA_BUF_ERROR if all of the following was true: - The caller knows how many bytes of output to expect and only provides that much output space. - When the last output bytes are decoded, the caller-provided input buffer ends right before the LZMA2 end of payload marker. So LZMA2 won't provide more output anymore, but it won't know it yet and thus won't return LZMA_STREAM_END yet. - A BCJ filter is in use and it hasn't left any unfiltered bytes in the temp buffer. This can happen with any BCJ filter, but in practice it's more likely with filters other than the x86 BCJ. Another situation where the bug can be triggered happens if the uncompressed size is zero bytes and no output space is provided. In this case the decompression can fail even if the whole input file is given to lzma_code(). A similar bug was fixed in XZ Embedded on 2011-09-19.	2012-05-28 20:42:11 +03:00
Lasse Collin	aac1b31ea4	liblzma: Remove outdated comments.	2012-04-19 15:25:26 +03:00
Lasse Collin	4c6e146df9	Add underscores to attributes (__attribute((__foo__))).	2011-05-17 11:54:38 +03:00
Lasse Collin	eb7d51a3fa	Collection of language fixes to comments and docs. Thanks to Jonathan Nieder.	2010-02-12 13:16:15 +02:00
Lasse Collin	418d64a32e	Fix a design error in liblzma API. Originally the idea was that using LZMA_FULL_FLUSH with Stream encoder would read the filter chain from the same array that was used to initialize the Stream encoder. Since most apps wouldn't use LZMA_FULL_FLUSH, most apps wouldn't need to keep the filter chain available after initializing the Stream encoder. However, due to my mistake, it actually required keeping the array always available. Since setting the new filter chain via the array used at initialization time is not a nice way to do it for a couple of reasons, this commit ditches it and introduces lzma_filters_update(). This new function replaces also the "persistent" flag used by LZMA2 (and to-be-designed Subblock filter), which was also an ugly thing to do. Thanks to Alexey Tourbin for reminding me about the problem that Stream encoder used to require keeping the filter chain allocated.	2009-11-14 18:59:19 +02:00
Lasse Collin	ebfb2c5e1f	Use a tuklib module for integer handling. This replaces bswap.h and integer.h. The tuklib module uses <byteswap.h> on GNU, <sys/endian.h> on *BSDs and <sys/byteorder.h> on Solaris, which may contain optimized code like inline assembly.	2009-10-04 22:57:12 +03:00
Lasse Collin	cd69a5a6c1	BCJ filters: Reject invalid start offsets with LZMA_OPTIONS_ERROR. This is a quick and slightly dirty fix to make the code conform to the latest file format specification. Without this patch, it's possible to make corrupt files by specifying start offset that is not a multiple of the filter's alignment. Custom start offset is almost never used, so this was only a minor bug. The xz command line tool doesn't validate the start offset, so one will get a bit unclear error message if trying to use an invalid start offset.	2009-07-10 11:39:38 +03:00
Lasse Collin	f42ee98166	Build system fixes. Don't use libtool convenience libraries to avoid recently discovered long-standing subtle but somewhat severe bugs in libtool (at least 1.5.22 and 2.2.6 are affected). It was found when porting XZ Utils to Windows <http://lists.gnu.org/archive/html/libtool/2009-06/msg00070.html> but the problem is significant also e.g. on GNU/Linux. Unless --disable-shared is passed to configure, static library built from a set of convenience libraries will contain PIC objects. That is, while libtool builds non-PIC objects too, only PIC objects will be used from the convenience libraries. On 32-bit x86 (tested on mobile XP2400+), using PIC instead of non-PIC makes the decompressor 10 % slower with the default CFLAGS. So while xz was linked against static liblzma by default, it got the slower PIC objects unless --disable-shared was used. I tend to develop and benchmark with --disable-shared due to faster build time, so I hadn't noticed the problem in benchmarks earlier. This commit also adds support for building Windows resources into liblzma and executables.	2009-06-30 17:09:57 +03:00
Lasse Collin	1c9360b7d1	Fix @variables@ to $(variables) in Makefile.am files. Fix the ordering of libgnu.a and LTLIBINTL on the linker command line and added missing LTLIBINTL to tests/Makefile.am.	2009-06-26 14:47:31 +03:00
Lasse Collin	e518d167aa	Fix uint32_t -> size_t in ARM and ARM-Thumb filters. On 64-bit system it would have gone into infinite loop if a single input buffer was over 4 GiB (unlikely).	2009-04-15 14:13:38 +03:00
Lasse Collin	02ddf09bc3	Put the interesting parts of XZ Utils into the public domain. Some minor documentation cleanups were made at the same time.	2009-04-13 11:27:40 +03:00
Lasse Collin	322ecf93c9	Renamed lzma_options_simple to lzma_options_bcj in the API. The internal implementation is still using the name "simple". It may need some cleanups, so I look at it later.	2008-12-31 16:29:39 +02:00
Lasse Collin	13a74b78e3	Renamed constants: - LZMA_VLI_VALUE_MAX -> LZMA_VLI_MAX - LZMA_VLI_VALUE_UNKNOWN -> LZMA_VLI_UNKNOWN - LZMA_HEADER_ERROR -> LZMA_OPTIONS_ERROR	2008-09-13 12:10:43 +03:00
Lasse Collin	3b34851de1	Sort of garbage collection commit. :-| Many things are still broken. API has changed a lot and it will still change a little more here and there. The command line tool doesn't have all the required changes to reflect the API changes, so it's easy to get "internal error" or trigger assertions.	2008-08-28 22:53:15 +03:00
Lasse Collin	7d17818cec	Update the code to mostly match the new simpler file format specification. Simplify things by removing most of the support for known uncompressed size in most places. There are some miscellaneous changes here and there too. The API of liblzma has got many changes and still some more will be done soon. While most of the code has been updated, some things are not fixed (the command line tool will choke with invalid filter chain, if nothing else). Subblock filter is somewhat broken for now. It will be updated once the encoded format of the Subblock filter has been decided.	2008-06-18 18:02:10 +03:00
Lasse Collin	f9842f7127	Return LZMA_HEADER_ERROR if LZMA_SYNC_FLUSH is used with any of the so called simple filters. If there is demand, limited support for LZMA_SYNC_FLUSH may be added in future. After this commit, using LZMA_SYNC_FLUSH shouldn't cause undefined behavior in any situation.	2008-01-26 00:25:34 +02:00
Lasse Collin	b254bd97b1	Fix wrong too small size of argument unfiltered_max in ia64_coder_init(). It triggered assert() in simple_coder.c, and could have caused a buffer overflow. This error was probably a copypaste mistake, since most of the simple filters use unfiltered_max = 4.	2008-01-17 17:39:42 +02:00
Lasse Collin	3e16d51dd6	Remove uncompressed size tracking from the filter encoders. It's not strictly needed there, and just complicates the code. LZ encoder never even had this feature. The primary reason to have uncompressed size tracking in filter encoders was validating that the application doesn't give different amount of input that it had promised. A side effect was to validate internal workings of liblzma. Uncompressed size tracking is still present in the Block encoder. Maybe it should be added to LZMA_Alone and raw encoders too. It's simpler to have one coder just to validate the uncompressed size instead of having it in every filter.	2007-12-11 16:49:19 +02:00
Lasse Collin	5d018dc035	Imported to git.	2007-12-09 00:42:33 +02:00
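
Commits 30e95bb44c and 2a22de439e both describe guarding on the size rather than on the pointer, so that neither null pointer + 0 nor memcpy(NULL, foo, 0) is ever evaluated. The following is a minimal sketch of that pattern, not code taken from liblzma. The macro my_ptr_add() is copied from the commit message of 30e95bb44c; append_bytes() is a hypothetical helper invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * From the commit message of 30e95bb44c: ptr + offset that never
 * evaluates pointer arithmetic when offset is 0, so a null ptr is fine
 * as long as offset == 0.
 */
#define my_ptr_add(ptr, offset) \
	((offset) != 0 ? ((ptr) + (offset)) : (ptr))

/*
 * Hypothetical helper (not a liblzma function): copy n bytes from src
 * to dest + *pos and advance *pos.
 *
 * The guard checks n > 0 instead of dest != NULL. With n == 0 neither
 * the pointer arithmetic nor memcpy() runs, so null pointers are never
 * touched. With n > 0 a null pointer is a caller bug, and the assert
 * catches it in debug builds instead of hiding it.
 */
static void
append_bytes(unsigned char *dest, size_t *pos,
		const unsigned char *src, size_t n)
{
	if (n > 0) {
		assert(dest != NULL && src != NULL);
		memcpy(dest + *pos, src, n);
		*pos += n;
	}
}
```

Calling append_bytes(NULL, &pos, NULL, 0) is then a no-op that leaves pos unchanged. Whether a compiler turns my_ptr_add() into branch-free code depends on the compiler, as the commit message notes.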

37 Commits
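
Commit 50255feeaa explains that the RISC-V filter reads only one byte in its main loop because not every processor supports fast unaligned access. A rough sketch of that general idea follows, under stated assumptions: it is not the filter's actual code, the function names read32le_bytes() and find_first_candidate() are hypothetical, and the 7-bit opcode test is only an illustrative match condition. The loop inspects a single byte per position and assembles a 32-bit little-endian value from individual bytes only when that byte matches.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a 32-bit little-endian value one byte at a
 * time. No unaligned 32-bit load is ever issued.
 */
static uint32_t
read32le_bytes(const uint8_t *buf)
{
	return (uint32_t)buf[0]
			| ((uint32_t)buf[1] << 8)
			| ((uint32_t)buf[2] << 16)
			| ((uint32_t)buf[3] << 24);
}

/*
 * Hypothetical scanner: step through buf at 2-byte granularity and look
 * at one byte per step. Only when the low 7 bits equal `opcode` are the
 * remaining three bytes read. Returns the index of the first match and
 * stores the 32-bit value in *inst, or returns size if nothing matched.
 */
static size_t
find_first_candidate(const uint8_t *buf, size_t size,
		uint8_t opcode, uint32_t *inst)
{
	if (size < 4)
		return size;

	for (size_t i = 0; i <= size - 4; i += 2) {
		if ((buf[i] & 0x7F) == opcode) {
			*inst = read32le_bytes(buf + i);
			return i;
		}
	}

	return size;
}
```

The trade-off mentioned in the commit message applies here as well: byte-wise access is portable to strict-alignment targets, at the cost of larger code on architectures where a single unaligned 32-bit load would have worked.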