Commit Graph

1321 Commits

Author SHA1 Message Date
Lasse Collin 09cabae2ab xz: Fix grammar in a comment
Fixes: cb3111e3ed
(cherry picked from commit b90339f4da)
2024-05-23 11:28:20 +03:00
Lasse Collin c10b66fbf9 xz: Rename filter_memusages to encoder_memusages
(cherry picked from commit 4c0bdaf13d)
2024-05-23 11:28:20 +03:00
Lasse Collin 9132ce3564 xz: Edit coding style
(cherry picked from commit b54aa023e0)
2024-05-23 11:28:20 +03:00
Lasse Collin d642e13874 xz: Rename filters_index to chain_num
The reason is the same as in bd0782c1f13e52cd0fd8415208e30e47004a4c68.

(cherry picked from commit 49f67d3d3f)
2024-05-23 11:28:20 +03:00
Lasse Collin 47599f3b73 xz: Replace a few uint32_t with "unsigned" to reduce the number of casts
These hold only tiny values.

(cherry picked from commit ff9e8b3d06)
2024-05-23 11:28:20 +03:00
Lasse Collin 8f5ab75c45 xz: Rename filters_used_mask to chains_used_mask
The reason is the same as in bd0782c1f13e52cd0fd8415208e30e47004a4c68.

(cherry picked from commit b5e6c1113b)
2024-05-23 11:28:20 +03:00
Lasse Collin 3eb7cf9dd5 xz: Move the setting of "check" in coder_set_compression_settings()
It's more logical to do it in the beginning instead of in the middle
of the filter chain handling.

Fixes: d6af7f3470
(cherry picked from commit 32500dfaad)
2024-05-23 11:28:20 +03:00
Lasse Collin 067961ee0e xz: Rename "filters" to "chains"
The convention is that

    lzma_filter filters[LZMA_FILTERS_MAX + 1];

contains the filters of a single filter chain.
It was so here as well before the commit d6af7f3470, which changed
"filters" to a ten-element array of filter chains.
It's clearer to call this array-of-arrays "chains".

This also renames "filter_idx" to "chain_idx" which is used
as an index as in chains[chain_idx].

(cherry picked from commit ad146b1f42)
2024-05-23 11:28:20 +03:00
Lasse Collin 6822f6f891 xz: Clean up a comment
(cherry picked from commit 5a4ae4e4d0)
2024-05-23 11:28:20 +03:00
Lasse Collin 0e5e3e7bdc xz: Add clarifying assertions
(cherry picked from commit 2de80494ed)
2024-05-23 11:28:20 +03:00
Lasse Collin 77bcf6b76a xz: Add a clarifying assertion
Fixes: 5f0c5a0438
(cherry picked from commit 1eaad004bf)
2024-05-23 11:28:20 +03:00
Lasse Collin df3efc058a xz: Clarify a comment
(cherry picked from commit 605094329b)
2024-05-23 11:28:20 +03:00
Lasse Collin 4ebfe11cd3 xz: Use the info collected in parse_block_list()
This is slightly simpler and it avoids looping through
the opt_block_list array.

(cherry picked from commit 8fac2577f2)
2024-05-23 11:28:20 +03:00
Lasse Collin bfea691361 xz: Remember the filter chains and the largest Block in parse_block_list()
(cherry picked from commit 81d350dab8)
2024-05-23 11:28:20 +03:00
Lasse Collin d4e33e7392 xz: Update a comment and initialization of filters_used_mask
(cherry picked from commit 46ab56968f)
2024-05-23 11:28:20 +03:00
Lasse Collin 3c130737c9 xz: parse_block_list: Edit integer type casting
(cherry picked from commit e89293a0ba)
2024-05-23 11:28:20 +03:00
Lasse Collin 40c8513b4e xz: Make filter_memusages a local variable
(cherry picked from commit 87011e40c1)
2024-05-23 11:28:20 +03:00
Lasse Collin cacaf25aa7 xz: Remove unused code and simplify
opt_mode == MODE_COMPRESS isn't possible when HAVE_ENCODERS isn't
defined. Thus, when *encoding*, the message about *decoder* memory
usage is possible to show only when both encoder and decoder have
been built.

Since the message is shown only at V_DEBUG, skip the memusage
calculation if verbosity level isn't high enough.

Fixes: 5f0c5a0438
(cherry picked from commit 347b412a93)
2024-05-23 11:28:20 +03:00
Lasse Collin 3495a6b291 xz: Fix integer type from uint64_t to uint32_t
lzma_options_lzma.dict_size is uint32_t so use it here too.

Fixes: 5f0c5a0438
(cherry picked from commit 31358c057c)
2024-05-23 11:28:20 +03:00
Lasse Collin 1b4e7dca24 xz: Edit comments and coding style
(cherry picked from commit e4780244a1)
2024-05-23 11:28:20 +03:00
Lasse Collin 18683525a7 xz: Omit an incorrect comment
It likely was a leftover from a development version of the code.

Fixes: 183819bfd9
(cherry picked from commit fe4d8b0c80)
2024-05-23 11:28:20 +03:00
Lasse Collin 005f039864 xz: Add braces to a for-statement and to an if-statement
No functional changes.

Fixes: 5f0c5a0438
Fixes: 479fd58d60
(cherry picked from commit 9bef5b8d17)
2024-05-23 11:28:20 +03:00
Lasse Collin 34be4e6aa6 liblzma: Omit an unneeded array from the x86 filter
Fixes: 6aa2a6deeb
(cherry picked from commit de06b9f0c0)
2024-05-23 11:28:20 +03:00
Lasse Collin f99e7c69ad xzdec: Support Landlock ABI version 4
This was added to xz in 02e3505991
but I forgot to do the same in xzdec.

The Landlock sandbox in xzdec could be stricter as now it's
active only for the last file being decompressed. In xz, a
read-only sandbox is used for the multi-file case. On the other hand,
xz doesn't go to the strictest mode when processing the last file
when more than one file was specified; xzdec does.

(cherry picked from commit 3334c71d3d)
2024-05-23 00:13:43 +03:00
Lasse Collin bfe9be7a46 liblzma: Fix incorrect function type error from sanitizer
Clang 17 with -fsanitize=address,undefined:

    src/liblzma/common/filter_common.c:366:8: runtime error:
        call to function encoder_find through pointer to incorrect
        function type 'const lzma_filter_coder *(*)(unsigned long)'
    src/liblzma/common/filter_encoder.c:187: note:
        encoder_find defined here

Use a wrapper function to get the correct type neatly.
This reduces the number of casts needed too.

This issue could be a problem with control flow integrity (CFI)
methods that check the function type on indirect function calls.

Fixes: 3b34851de1
(cherry picked from commit 278563ef8f)
2024-05-23 00:13:43 +03:00
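
The wrapper approach described in bfe9be7a46 can be illustrated with a minimal sketch. The types and names below (`filter_coder`, `filter_encoder`, `coder_find_fn`, `is_supported`) are hypothetical stand-ins and not the liblzma definitions; the point is only that the function passed as a callback has exactly the pointer type the caller expects, so no function pointer cast is needed.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-ins for the real liblzma types.
typedef struct { uint64_t id; } filter_coder;
typedef struct { filter_coder base; int extra; } filter_encoder;

// Generic lookup type expected by the shared code.
typedef const filter_coder *(*coder_find_fn)(uint64_t id);

static const filter_encoder encoders[] = {
	{ { 0x21 }, 1 },
	{ { 0x04 }, 2 },
};

// Type-specific lookup: returns a pointer to the encoder struct,
// so its function type differs from coder_find_fn.
static const filter_encoder *
encoder_find(uint64_t id)
{
	for (size_t i = 0; i < sizeof(encoders) / sizeof(encoders[0]); ++i)
		if (encoders[i].base.id == id)
			return &encoders[i];

	return NULL;
}

// Wrapper whose type matches coder_find_fn exactly. Calling it
// through a coder_find_fn pointer is well defined.
static const filter_coder *
coder_find(uint64_t id)
{
	const filter_encoder *e = encoder_find(id);
	return e == NULL ? NULL : &e->base;
}

// Shared code that only knows about the generic callback type.
static int
is_supported(coder_find_fn find, uint64_t id)
{
	return find(id) != NULL;
}
```

Passing `encoder_find` itself would require casting it to `coder_find_fn`, which is the kind of mismatched indirect call that the sanitizer reports.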
Lasse Collin 882eadc5b8 xz: Avoid arithmetic on a null pointer
It's undefined behavior. The result wasn't ever used as it occurred
in the last iteration of a loop.

Clang 17 with -fsanitize=address,undefined:

    $ src/xz/xz --block-list=123
    src/xz/args.c:164:12: runtime error: applying non-zero offset 1
        to null pointer

Fixes: 88ccf47205
Co-authored-by: Sam James <sam@gentoo.org>
(cherry picked from commit 77c8f60547)
2024-05-23 00:13:43 +03:00
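
The class of bug fixed in 882eadc5b8 can be sketched as follows. This is a hypothetical helper (`count_items`), not the code from src/xz/args.c, and it assumes a comma-separated list similar to the argument of --block-list. The pointer is advanced past the comma only after checking that a comma was found.

```c
#include <stddef.h>
#include <string.h>

// Count the comma-separated items in a list such as "123,456".
// Hypothetical helper for illustration only.
static size_t
count_items(const char *str)
{
	size_t count = 0;

	while (str != NULL) {
		++count;

		// strchr() returns NULL when there are no more commas.
		// Computing "comma + 1" unconditionally would apply an
		// offset to a null pointer on the last iteration, which
		// is undefined behavior even if the result is never used.
		const char *comma = strchr(str, ',');
		str = comma == NULL ? NULL : comma + 1;
	}

	return count;
}
```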
Lasse Collin 28e7d130cb Build: Add --enable-doxygen to generate and install API docs
It requires Doxygen. This option is disabled by default.

(cherry picked from commit e21efdf96f)
2024-05-23 00:13:43 +03:00
Lasse Collin bae288ea6f liblzma: index_decoder: Fix missing initializations on LZMA_PROG_ERROR
If the arguments to lzma_index_decoder() or lzma_index_buffer_decode()
were such that LZMA_PROG_ERROR was returned, the lzma_index **i
argument wasn't touched even though the API docs say that *i = NULL
is done if an error occurs. This obviously won't be done even now
if i == NULL but otherwise it is best to do it due to the wording
in the API docs.

In practice this matters very little: The problem can occur only
if the functions are called with invalid arguments, that is,
the calling application must already have a bug.

(cherry picked from commit 71eed2520e)
2024-05-22 14:32:36 +03:00
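
The contract described in bae288ea6f can be sketched with a hypothetical decoder function. The names `my_index`, `my_index_decode`, `MY_OK`, and `MY_PROG_ERROR` are invented for this example and the argument checks are simplified; the sketch only shows the output pointer being cleared before any validation can return an error.

```c
#include <stddef.h>

// Hypothetical stand-ins for lzma_index and the lzma_ret codes.
typedef struct { int dummy; } my_index;
enum { MY_OK = 0, MY_PROG_ERROR = 11 };

static my_index decoded_index;

static int
my_index_decode(my_index **i, const unsigned char *in, size_t in_size)
{
	// Clear the output first so that *i is NULL on every error
	// path, including the early return for invalid arguments.
	// This cannot be done when i itself is NULL.
	if (i != NULL)
		*i = NULL;

	if (i == NULL || in == NULL || in_size == 0)
		return MY_PROG_ERROR;

	// A real decoder would parse the input here.
	*i = &decoded_index;
	return MY_OK;
}
```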
Sam James 493bc57c33 liblzma: outqueue: add header guard
Reported by github's codeql.

(cherry picked from commit c7ef767c49)
2024-05-22 14:32:36 +03:00
Sam James cede418d4f liblzma: easy_preset: add header guard
Reported by github's codeql.

(cherry picked from commit 55dcae3056)
2024-05-22 14:32:36 +03:00
Lasse Collin 6e76a25df2 tuklib_integer: Rename bswapXX to byteswapXX
The __builtin_bswapXX from GCC and Clang are preferred when
they are available. This can allow compilers to emit the x86 MOVBE
instruction instead of doing a load + byteswap as two instructions
(which would happen if the byteswapping is done in inline asm).

bswap16, bswap32, and bswap64 exist in system headers on *BSDs
and Darwin. #defining bswap16 on NetBSD results in a warning about
macro redefinition. It's safest to avoid this namespace conflict
completely.

No OS supported by tuklib_integer.h uses byteswapXX names and
a web search doesn't immediately find any obvious danger of
namespace conflicts. So let's try these still-pretty-short names
for the macros.

Thanks to Sam James for pointing out the compiler warning on
NetBSD 10.0.

(cherry picked from commit 4ffc60f323)
2024-05-22 14:32:36 +03:00
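
A minimal sketch of the naming idea in 6e76a25df2: the macros use the byteswapXX names, so they don't collide with bswap16 or bswap32 from system headers, and they prefer the compiler builtins. The `__GNUC__` check is an assumption made for this example; it is not necessarily how tuklib_integer.h detects the builtins.

```c
#include <stdint.h>

#if defined(__GNUC__)
	// GCC and Clang: the builtins let the compiler pick the best
	// instruction sequence, for example a combined load and swap.
#	define byteswap16(n) ((uint16_t)__builtin_bswap16(n))
#	define byteswap32(n) ((uint32_t)__builtin_bswap32(n))
#else
	// Portable fallback using shifts and masks.
#	define byteswap16(n) ((uint16_t)( \
		  (((uint16_t)(n) & 0x00FFU) << 8) \
		| (((uint16_t)(n) & 0xFF00U) >> 8)))
#	define byteswap32(n) ((uint32_t)( \
		  (((uint32_t)(n) & UINT32_C(0x000000FF)) << 24) \
		| (((uint32_t)(n) & UINT32_C(0x0000FF00)) << 8) \
		| (((uint32_t)(n) & UINT32_C(0x00FF0000)) >> 8) \
		| (((uint32_t)(n) & UINT32_C(0xFF000000)) >> 24)))
#endif
```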
Lasse Collin 0ca14871f3 liblzma: API doc cleanups
(cherry picked from commit 08ab0966a7)
2024-05-22 14:32:36 +03:00
Lasse Collin 3ba3ef57f9 liblzma: lzma_str_to_filters: Set *error_pos on all errors
The API docs clearly say that if error_pos isn't NULL then *error_pos
is always set on any error. However, it wasn't touched if str == NULL
or filters == NULL or unsupported flags were specified.

Fixes: cedeeca2ea
(cherry picked from commit 70d12dd069)
2024-05-22 14:32:36 +03:00
Lasse Collin 57ad820e15 liblzma: Clean up white space
(cherry picked from commit ed8e552395)
2024-05-22 14:32:36 +03:00
Lasse Collin 6e210d5766 liblzma: Silence a warning from Coverity static analysis
It is logical why it cannot know for sure that the value has
to be at most 4 if it is less than 16.

The x86 filter is based on a very old LZMA SDK version. Newer
ones have quite a different implementation for the same filter.

Thanks to Sam James.

(cherry picked from commit 6aa2a6deeb)
2024-05-22 14:32:36 +03:00
Lasse Collin 7413383e42 xz: Fix white space error.
Thanks to xx on #tukaani.

(cherry picked from commit eeca8f7c5b)
2024-05-22 14:32:36 +03:00
Sam James eed2f26c0e xz: add missing noreturn for message_filters_help
Fixes: a165d7df19
(cherry picked from commit 462ca94099)
2024-05-22 14:32:36 +03:00
Sam James 2633d8df61 xz: signals: suppress -Wsign-conversion on macOS
On macOS, we get:
```
signals.c: In function 'signals_init':
signals.c:76:17: error: conversion to 'sigset_t' {aka 'unsigned int'} from 'int' may change the sign of the result [-Werror=sign-conversion]
   76 |                 sigaddset(&hooked_signals, sigs[i]);
      |                 ^~~~~~~~~
signals.c:81:17: error: conversion to 'sigset_t' {aka 'unsigned int'} from 'int' may change the sign of the result [-Werror=sign-conversion]
   81 |                 sigaddset(&hooked_signals, message_progress_sigs[i]);
      |                 ^~~~~~~~~
signals.c:86:9: error: conversion to 'sigset_t' {aka 'unsigned int'} from 'int' may change the sign of the result [-Werror=sign-conversion]
   86 |         sigaddset(&hooked_signals, SIGTSTP);
      |         ^~~~~~~~~
```

We use `int` for `hooked_signals` but we can't just cast to whatever
`sigset_t` is because `sigset_t` is an opaque type. It's an unsigned int
on macOS. On macOS, `sigaddset` is implemented as a macro.

Just suppress -Wsign-conversion for `signals_init` for macOS given
there's no real nice way of fixing this.

(cherry picked from commit 863f13d282)
2024-05-22 14:32:36 +03:00
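
The kind of suppression described in 2633d8df61 can be sketched with diagnostic pragmas around a single function. The function name `build_signal_set` and the exact preprocessor condition are assumptions for this example, not the code from signals.c.

```c
#ifndef _POSIX_C_SOURCE
#	define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>

// Limit the suppression to macOS, where sigaddset() is a macro
// that triggers -Wsign-conversion, and to this one function.
#if defined(__APPLE__) && defined(__GNUC__)
#	pragma GCC diagnostic push
#	pragma GCC diagnostic ignored "-Wsign-conversion"
#endif

// Hypothetical helper: build a set containing SIGINT and SIGTERM.
// Returns 0 on success and -1 on failure.
static int
build_signal_set(sigset_t *set)
{
	if (sigemptyset(set) != 0)
		return -1;

	if (sigaddset(set, SIGINT) != 0 || sigaddset(set, SIGTERM) != 0)
		return -1;

	return 0;
}

#if defined(__APPLE__) && defined(__GNUC__)
#	pragma GCC diagnostic pop
#endif
```

The push and pop pair keeps the warning enabled for the rest of the file.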
Lasse Collin 5d20a61205 liblzma: CRC: Simplify table omission macros
A macro is useful to prevent a single #if directive from
getting too ugly but only one macro is needed for all archs.

(cherry picked from commit 6286c1900c)
2024-05-22 14:26:03 +03:00
Lasse Collin 2a80827e23 liblzma: ARM64 CRC: Fix omission of CRC32 table
The macro name had an odd typo so the table wasn't omitted
when it should have been.

Fixes: 1940f0ec28
(cherry picked from commit 45da936c87)
2024-05-22 14:26:03 +03:00
Lasse Collin 9223ad6e78 liblzma: ARM64 CRC32: Change style of the macOS code to match FreeBSD
I didn't test this but it shouldn't change any functionality.

Fixes: 761f5b69a4
(cherry picked from commit fc43cecd32)
2024-05-22 14:26:03 +03:00
Lasse Collin 32ceb2c36a liblzma: ARM64 CRC32: Add error checking to FreeBSD-specific code
Also add parentheses to the return statement.

I didn't test this.

Fixes: 761f5b69a4
(cherry picked from commit 1024cd4cd9)
2024-05-22 14:26:03 +03:00
Lasse Collin 42915101e9 liblzma: ARM64 CRC32: Use negation instead of subtracting from 8
Subtracting from 0 is negation, this just keeps warnings away.

Fixes: 761f5b69a4
(cherry picked from commit 2337f7021c)
2024-05-22 14:26:03 +03:00
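
Why negation can replace the subtraction from 8 in 42915101e9: when the result is masked with 7, subtracting from 8 and subtracting from 0 give the same value, because the two differ by a multiple of 8. The sketch below uses hypothetical helper names and assumes the value is an alignment amount computed from an address; it is not the code from the commit.

```c
#include <stddef.h>
#include <stdint.h>

// Bytes needed to reach the next 8-byte boundary, written as a
// subtraction from 8.
static size_t
align_amount_sub8(uintptr_t addr)
{
	return (size_t)((8 - addr) & 7);
}

// The same value written as a negation (subtraction from 0).
// Unsigned arithmetic wraps around, so this is well defined.
static size_t
align_amount_neg(uintptr_t addr)
{
	return (size_t)((0U - addr) & 7);
}
```

For example, an address ending in 13 needs 3 more bytes to reach 16 in both forms, and an already aligned address needs 0.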
Lasse Collin 42a9482b48 liblzma: ARM64 CRC32: Tweak coding style and comments
(cherry picked from commit d8fffd01aa)
2024-05-22 14:26:03 +03:00
Lasse Collin 34d1252f09 liblzma: Remove ifunc support.
This is *NOT* done for security reasons even though the backdoor
relied on the ifunc code. Instead, the reason is that in this
project ifunc provides little benefit but it's quite a bit of
extra code to support it. The only case where ifunc *might* matter
for performance is if the CRC functions are used directly by an
application. In normal compression use it's completely irrelevant.

(cherry picked from commit 689ae24273)
2024-05-22 14:12:43 +03:00
Lasse Collin 1a1f3d0323 xz man page: Use .ft CR instead of CW to silence warnings from groff.
(cherry picked from commit 31ef676567)
2024-05-22 14:12:43 +03:00
Lasse Collin 879295d91f Update maintainer and author info.
The other maintainer suddenly disappeared.

(cherry picked from commit 77a294d98a)
2024-05-22 14:12:43 +03:00
Lasse Collin eeb74fba1f Update website URLs back to tukaani.org.
The XZ projects were moved back to their original URLs.

(cherry picked from commit 17aa2e1a79)
2024-05-22 14:12:39 +03:00
Lasse Collin a7b9cd7000 xzdec: Tweak coding style and comments.
(cherry picked from commit 2739db9810)
2024-05-22 14:12:13 +03:00
Lasse Collin b3a7561880 liblzma: memcmplen.h: Add a comment why subtraction is used.
(cherry picked from commit 0b99783d63)
2024-05-22 14:07:37 +03:00