Commit Graph

1250 Commits

Author SHA1 Message Date
Lasse Collin cae9a5e0bf xz: Use stricter pledge(2) and Landlock sandbox.
This makes these sandboxing methods stricter when no files are
created or deleted. That is, it's a middle ground between the
initial sandbox and the strictest single-file-to-stdout sandbox:
this allows opening files for reading but output has to go to stdout.
2024-02-17 23:07:35 +02:00
Lasse Collin 02e3505991 xz: Support Landlock ABI version 4.
Linux 6.7 added support for ABI version 4 which restricts
TCP connections which xz won't need and thus those can be
forbidden now. Since the ABI version is handled at runtime,
supporting version 4 won't cause any compatibility issues.

Note that new enough kernel headers are required to get
version 4 support enabled at build time.
2024-02-17 23:07:35 +02:00
Lasse Collin 374868d81d xz: Move sandboxing code to sandbox.c and improve Landlock sandbox.
Landlock is now always used just like pledge(2) is: first in more
permissive mode and later (under certain common conditions) in
a strict mode that doesn't allow opening more files.

I put pledge(2) first in sandbox.c because it's the simplest API
to use and still somewhat fine-grained for basic applications.
So it's the simplest thing to understand for anyone reading sandbox.c.
2024-02-17 23:07:35 +02:00
Lasse Collin 7312dfbb02 xz: Tweak comments. 2024-02-17 23:07:35 +02:00
Lasse Collin c701a5909a xz: Fix message_init() description.
Also explicitly initialize progress_automatic to make it clear
that it can be read before message_init() sets it. Static variable
was initialized to false by default already so this is only for
clarity.
2024-02-17 23:07:35 +02:00
Lasse Collin 56246607df Build: Install translated lzmainfo man pages.
All other translated man pages were being installed but
lzmainfo had been forgotten.
2024-02-17 16:23:14 +02:00
Lasse Collin f1d6b88aef liblzma: Avoid implementation-defined behavior in the RISC-V filter.
GCC docs promise that it works and a few other compilers do
too. Clang/LLVM is documented source code only but unsurprisingly
it behaves the same as others on x86-64 at least. But the
certainly-portable way is good enough here so use that.
2024-02-17 16:01:32 +02:00
Lasse Collin 843ddc5f61 liblzma: Wrap a line exceeding 80 chars. 2024-02-17 15:50:21 +02:00
Sebastian Andrzej Siewior e9053c9072 liblzma/rangecoder: Exclude x32 from the x86-64 optimisation.
The x32 port has a x86-64 ABI in term of all registers but uses only
32bit pointer like x86-32. The assembly optimisation fails to compile on
x32. Given the state of x32 I suggest to exclude it from the
optimisation rather than trying to fix it.

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
2024-02-17 15:50:21 +02:00
Jia Tan fb5f6aaf18 Fix typos discovered by codespell. 2024-02-16 22:54:59 +08:00
Jia Tan 6f1790254a Bump version for 5.5.2beta. 2024-02-15 01:53:40 +08:00
Lasse Collin 924fdeedf4 liblzma: Fix validate_map.sh.
Adding the SPDX license identifier changed the line numbers.
2024-02-14 19:46:11 +02:00
Lasse Collin a4557bad96 liblzma: Silence warnings in --enable-small build. 2024-02-14 19:21:45 +02:00
Lasse Collin 160b686264 liblzma: Silence a warning. 2024-02-14 19:05:58 +02:00
Lasse Collin 8af7db854f xz: Mention lzmainfo if trying to use 'lzma --list'.
This kind of fixes the problem reported here:
https://bugs.launchpad.net/ubuntu/+source/xz-utils/+bug/1291020
2024-02-14 18:31:16 +02:00
Lasse Collin 0668907ff7 liblzma: Add comments. 2024-02-14 18:31:16 +02:00
Lasse Collin 109f1913d4 Scripts: Add lz4 support to xzgrep and xzdiff. 2024-02-14 18:31:16 +02:00
Lasse Collin de55485cb2 liblzma: Choose the range decoder variants using a bitmask macro. 2024-02-14 18:31:16 +02:00
Lasse Collin 0709c2b2d7 xz: Fix outdated threading related info on the man page. 2024-02-14 18:31:16 +02:00
Lasse Collin 3182a330c1 liblzma: Range decoder: Add x86-64 inline assembly.
It's compatible with GCC and Clang.
2024-02-14 18:31:16 +02:00
Lasse Collin cba2edc991 liblzma: Range decoder: Add branchless C code.
It's used only for basic bittrees and fixed-size reverse bittree
because those showed a clear benefit on x86-64 with GCC and Clang.
The other methods were more mixed and thus are commented out but
they should be tested on other archs.
2024-02-14 18:31:16 +02:00
Lasse Collin e290a72d6d liblzma: Clarify a comment. 2024-02-14 18:31:16 +02:00
Lasse Collin 5e04706b91 liblzma: LZMA decoder: Optimize loop comparison.
But now it needs one more local variable.
2024-02-14 18:31:16 +02:00
Lasse Collin 88276f9f2c liblzma: Optimize literal_subcoder() macro slightly. 2024-02-14 18:31:16 +02:00
Lasse Collin 5938f6de4d liblzma: LZ decoder: Add unlikely(). 2024-02-14 18:31:16 +02:00
Lasse Collin 9c252e3ed0 liblzma: LZ decoder: Remove a useless unlikely(). 2024-02-14 18:31:16 +02:00
Lasse Collin f3872a5947 liblzma: Optimize LZ decoder slightly.
Now extra buffer space is reserved so that repeating bytes for
any single match will never need to copy from two places (both
the beginning and the end of the buffer). This simplifies
dict_repeat() and helps a little with speed.

This seems to reduce .lzma decompression time about 2 %, so
with .xz and CRC it could be slightly less. The small things
add up still.
2024-02-14 18:31:16 +02:00
Lasse Collin eb518446e5 liblzma: LZMA decoder: Get rid of next_state[].
It's not completely obvious if this is better in the decoder.
It should be good if compiler can avoid creating a branch
(like using CMOV on x86).

This also makes lzma_encoder.c use the new macros.
2024-02-14 18:31:16 +02:00
Lasse Collin e0c0ee475c liblzma: LZMA decoder improvements.
This adds macros for bittree decoding which prepares the code
for alternative C versions and inline assembly.
2024-02-14 18:31:16 +02:00
Jia Tan de5c5e4176 liblzma: Creates Non-resumable and Resumable modes for lzma_decoder.
The new decoder resumes the first decoder loop in the Resumable mode.
Then, the code executes in Non-resumable mode until it detects that it
cannot guarantee to have enough input/output to decode another symbol.

The Resumable mode is how the decoder has always worked. Before decoding
every input bit, it checks if there is enough space and will save its
location to be resumed later. When the decoder has more input/output,
it jumps back to the correct sequence in the Resumable mode code.

When the input/output buffers are large, the Resumable mode is much
slower than the Non-resumable because it has more branches and is harder
for the compiler to optimize since it is in a large switch block.

Early benchmarking shows significant time improvement (8-10% on gcc and
clang x86) by using the Non-resumable code as much as possible.
2024-02-14 18:31:16 +02:00
Jia Tan e446ab7a18 liblzma: Creates separate "safe" range decoder mode.
The new "safe" range decoder mode is the same as old range decoder, but
now the default behavior of the range decoder will not check if there is
enough input or output to complete the operation. When the buffers are
close to fully consumed, the "safe" operations must be used instead. This
will improve speed because it will reduce the number of branches needed
for most of the range decoder operations.
2024-02-14 18:31:16 +02:00
Lasse Collin e48287bf51 xzdiff, xzgrep, and xzmore: Rewrite the man pages.
The main reason is a kind of silly one:

xz-man.pot contains strings from all man pages in XZ Utils.
The man pages of xzdiff, xzgrep, and xzmore were under GPLv2
and the rest under 0BSD. Thus xz-man.pot contained strings
under two licences. po4a creates the translated man pages
from the combined 0BSD+GPLv2 xz-man.pot.

I haven't liked this mixing in xz-man.pot but the
Translation Project requires that all man pages must be
in the same .pot file. So a separate xz-man-gpl.pot
wasn't an option.

Since these man pages are short, rewriting them was quick enough.
Now xz-man.pot is entirely under 0BSD and marking the per-file
licenses is simpler.

As a bonus, some wording hopefully is now slightly better
although it's perhaps a matter of taste.

NOTE: In xzgrep.1, the EXIT STATUS section was written by me
in the commit d796b6d7fd so that's
why that section could be taken as is from the old xzgrep.1.
2024-02-14 18:31:16 +02:00
Lasse Collin 3e551b111b xzless: Update man page slightly.
The xz tool can decompress three file formats and xzless
has always supported uncompressed files too.
2024-02-14 18:31:16 +02:00
Lasse Collin b941549573 liblzma: Include the SPDX license identifier 0BSD to generated files.
Perhaps the generated files aren't even copyrightable but
using the same license for them as for the rest of the liblzma
keeps things more consistent for tools that look for license info.
2024-02-14 18:31:16 +02:00
Lasse Collin 8e4ec79483 liblzma: Fix compilation of price_tablegen.c.
It is built and run only manually so this didn't matter
unless one wanted to regenerate the price_table.c.
2024-02-14 18:31:16 +02:00
Lasse Collin e99bff3ffb Add SPDX license identifiers to GPL, LGPL, and FSFULLR files. 2024-02-14 18:31:16 +02:00
Lasse Collin 22af94128b Add SPDX license identifier into 0BSD source code files. 2024-02-14 18:31:16 +02:00
Lasse Collin 23de53421e liblzma: Sync the AUTHORS fix about SHA-256 to lzma.h. 2024-02-14 18:31:16 +02:00
Lasse Collin 689e0228ba Change most public domain parts to 0BSD.
Translations and doc/xz-file-format.txt and doc/lzma-file-format.txt
were not touched.

COPYING.0BSD was added.
2024-02-14 18:31:12 +02:00
Lasse Collin 76946dc433 Fix SHA-256 authors.
The initial commit 5d018dc035
in 2007 had a comment in sha256.c that the code is based on
Crypto++ Library 5.5.1. In 2009 the Authors list in sha256.c
and the AUTHORS file was updated with information that the
code had come from Crypto++ but via 7-Zip. I know I had viewed
7-Zip's SHA-256 code but back then the C code has been identical
enough with Crypto++, so I don't why I thought the author info
would need that extra step via 7-Zip for this single file.

Another error is that I had mixed sha.* and shacal2.* files
when checking for author info in Crypto++. The shacal2.* files
aren't related to liblzma's sha256.c and thus Kevin Springle's
code in Crypto++ isn't either.
2024-02-14 15:23:00 +02:00
Jia Tan 45663443eb liblzma: Fix build error if only RISC-V BCJ filter is enabled.
If any other BCJ filter was enabled for encoding or decoding, then this
was not a problem.
2024-02-13 23:33:21 +08:00
Lasse Collin 9860d418d2 xzless: Use ||- in LESSOPEN with with "less" 451 and newer. 2024-02-09 23:21:01 +02:00
Lasse Collin fd0692b052 xzless: Use --show-preproc-errors with "less" 632 and newer.
This makes "less" show a warning if a decompression error occurred.
2024-02-09 23:00:05 +02:00
Jia Tan adb073da76 liblzma: Fix typo discovered by codespell. 2024-02-09 23:59:54 +08:00
Jia Tan 7f68a68c19 liblzma: Update Authors list in crc32_arm64.h. 2024-02-02 01:38:51 +08:00
Jia Tan 97f9ba50b8 liblzma: Check HAVE_USABLE_CLMUL before omitting CRC32 table.
This was split from the prior commit so it could be easily applied to
the 5.4 branch.

Closes: https://github.com/tukaani-project/xz/pull/77
2024-02-01 20:09:11 +08:00
Jia Tan ca9015f4de liblzma: Check HAVE_USABLE_CLMUL before omitting CRC64 table.
If liblzma is configured with --disable-clmul-crc
CFLAGS="-msse4.1 -mpclmul", then it will fail to compile because the
generic version must be used but the CRC tables were not included.
2024-02-01 20:09:11 +08:00
Jia Tan 2f1552a91c liblzma: Only use ifunc in crcXX_fast.c if its needed.
The code was using HAVE_FUNC_ATTRIBUTE_IFUNC instead of CRC_USE_IFUNC.
With ARM64, ifunc is incompatible because it requires non-inline
function calls for runtime detection.
2024-02-01 20:09:11 +08:00
Jia Tan 1940f0ec28 liblzma: Omit CRC tables when not needed with ARM64 optimizations.
This is similar to the existing x86-64 CLMUL conditions to omit the
tables. They were slightly refactored to improve readability.
2024-02-01 20:09:11 +08:00
Jia Tan 761f5b69a4 liblzma: Rename crc32_aarch64.h to crc32_arm64.h.
Even though the proper name for the architecture is aarch64, this
project uses ARM64 throughout. So the rename is for consistency.

Additionally, crc32_arm64.h was slightly refactored for the following
changes:

   * Added MSVC, FreeBSD, and macOS support in
     is_arch_extension_supported().

   * crc32_arch_optimized() now checks the size when aligning the
     buffer.

   * crc32_arch_optimized() loop conditions were slightly modified to
     avoid both decrementing the size and incrementing the buffer
     pointer.

   * Use the intrinsic wrappers defined in <arm_acle.h> because GCC and
     Clang name them differently.

   * Minor spacing and comment changes.
2024-02-01 20:09:11 +08:00