Commit Graph

1331 Commits

Author SHA1 Message Date
Lasse Collin 8fda5ce872 Fix typos
Thanks to xx on #tukaani.

(cherry picked from commit 4e9023857d)
2024-05-23 11:36:05 +03:00
Lasse Collin 2729079bcb liblzma: Fix white space
Thanks to xx on #tukaani.

(cherry picked from commit b14d08fbbc)
2024-05-23 11:36:05 +03:00
Lasse Collin a289c4dfeb xz: Document the static function get_chains_memusage()
(cherry picked from commit 142e670a41)
2024-05-23 11:28:20 +03:00
Lasse Collin 6f0db31713 xz: Rename filters_memusage_max() to get_chains_memusage()
(cherry picked from commit 78e984399a)
2024-05-23 11:28:20 +03:00
Lasse Collin d7e2bf7e2d xz: Rename filter_memusages to chains_memusages
(cherry picked from commit 54c3db0a83)
2024-05-23 11:28:20 +03:00
Lasse Collin 58f200b6d1 xz: Simplify the memory usage scaling code
This is closer to what it was before the --filtersX support was added,
just extended to support for scaling all filter chains. The method
before this commit was an extended version of the original too but
it was done in a more complex way for no clear reason. In case of
an error, the complex version printed fewer informative messages
(a good thing) but it's not a sigificant benefit.

In the limit is too low even for single-threaded mode, the required
amount of memory is now reported like in 5.4.x instead of like in
5.5.1alpha - 5.6.1 which showed the original non-scaled usage. It
had been a FIXME in the old code but it's not clear what message
makes the most sense.

Fixes: 5f0c5a0438
(cherry picked from commit d9e1ae79ec)
2024-05-23 11:28:20 +03:00
Lasse Collin 41bdc9fa5c xz: Edit comments
(cherry picked from commit 0ee56983d1)
2024-05-23 11:28:20 +03:00
Lasse Collin 52e40c1912 xz: Rename chain_idx to chain_num
(cherry picked from commit ec82a49c35)
2024-05-23 11:28:20 +03:00
Lasse Collin 8a01963331 xz: Edit coding style
(cherry picked from commit a731a6993c)
2024-05-23 11:28:20 +03:00
Lasse Collin e3ad7eda74 xz: Edit comments
Fixes: 5f0c5a0438
(cherry picked from commit 32eb176b89)
2024-05-23 11:28:20 +03:00
Lasse Collin 09cabae2ab xz: Fix grammar in a comment
Fixes: cb3111e3ed
(cherry picked from commit b90339f4da)
2024-05-23 11:28:20 +03:00
Lasse Collin c10b66fbf9 xz: Rename filter_memusages to encoder_memusages
(cherry picked from commit 4c0bdaf13d)
2024-05-23 11:28:20 +03:00
Lasse Collin 9132ce3564 xz: Edit coding style
(cherry picked from commit b54aa023e0)
2024-05-23 11:28:20 +03:00
Lasse Collin d642e13874 xz: Rename filters_index to chain_num
The reason is the same as in bd0782c1f13e52cd0fd8415208e30e47004a4c68.

(cherry picked from commit 49f67d3d3f)
2024-05-23 11:28:20 +03:00
Lasse Collin 47599f3b73 xz: Replace a few uint32_t with "unsigned" to reduce the number of casts
These hold only tiny values.

(cherry picked from commit ff9e8b3d06)
2024-05-23 11:28:20 +03:00
Lasse Collin 8f5ab75c45 xz: Rename filters_used_mask to chains_used_mask
The reason is the same as in bd0782c1f13e52cd0fd8415208e30e47004a4c68.

(cherry picked from commit b5e6c1113b)
2024-05-23 11:28:20 +03:00
Lasse Collin 3eb7cf9dd5 xz: Move the setting of "check" in coder_set_compression_settings()
It's more logical to do it in the beginning instead of in the middle
of the filter chain handling.

Fixes: d6af7f3470
(cherry picked from commit 32500dfaad)
2024-05-23 11:28:20 +03:00
Lasse Collin 067961ee0e xz: Rename "filters" to "chains"
The convention is that

    lzma_filter filters[LZMA_FILTERS_MAX + 1];

contains the filters of a single filter chain.
It was so here as well before the commit
d6af7f3470.
It changes "filters" to a ten-element array of filter chains.
It's clearer to call this array-of-arrays "chains".

This also renames "filter_idx" to "chain_idx" which is used
as an index as in chains[chain_idx].

(cherry picked from commit ad146b1f42)
2024-05-23 11:28:20 +03:00
Lasse Collin 6822f6f891 xz: Clean up a comment
(cherry picked from commit 5a4ae4e4d0)
2024-05-23 11:28:20 +03:00
Lasse Collin 0e5e3e7bdc xz: Add clarifying assertions
(cherry picked from commit 2de80494ed)
2024-05-23 11:28:20 +03:00
Lasse Collin 77bcf6b76a xz: Add a clarifying assertion
Fixes: 5f0c5a0438
(cherry picked from commit 1eaad004bf)
2024-05-23 11:28:20 +03:00
Lasse Collin df3efc058a xz: Clarify a comment
(cherry picked from commit 605094329b)
2024-05-23 11:28:20 +03:00
Lasse Collin 4ebfe11cd3 xz: Use the info collected in parse_block_list()
This is slightly simpler and it avoids looping through
the opt_block_list array.

(cherry picked from commit 8fac2577f2)
2024-05-23 11:28:20 +03:00
Lasse Collin bfea691361 xz: Remember the filter chains and the largest Block in parse_block_list()
(cherry picked from commit 81d350dab8)
2024-05-23 11:28:20 +03:00
Lasse Collin d4e33e7392 xz: Update a comment and initialization of filters_used_mask
(cherry picked from commit 46ab56968f)
2024-05-23 11:28:20 +03:00
Lasse Collin 3c130737c9 xz: parse_block_list: Edit integer type casting
(cherry picked from commit e89293a0ba)
2024-05-23 11:28:20 +03:00
Lasse Collin 40c8513b4e xz: Make filter_memusages a local variable
(cherry picked from commit 87011e40c1)
2024-05-23 11:28:20 +03:00
Lasse Collin cacaf25aa7 xz: Remove unused code and simplify
opt_mode == MODE_COMPRESS isn't possible when HAVE_ENCODERS isn't
defined. Thus, when *encoding*, the message about *decoder* memory
usage is possible to show only when both encoder and decoder have
been built.

Since the message is shown only at V_DEBUG, skip the memusage
calculation if verbosity level isn't high enough.

Fixes: 5f0c5a0438
(cherry picked from commit 347b412a93)
2024-05-23 11:28:20 +03:00
Lasse Collin 3495a6b291 xz: Fix integer type from uint64_t to uint32_t
lzma_options_lzma.dict_size is uint32_t so use it here too.

Fixes: 5f0c5a0438
(cherry picked from commit 31358c057c)
2024-05-23 11:28:20 +03:00
Lasse Collin 1b4e7dca24 xz: Edit comments and coding style
(cherry picked from commit e4780244a1)
2024-05-23 11:28:20 +03:00
Lasse Collin 18683525a7 xz: Omit an incorrect comment
It likely was a leftover from a development version of the code.

Fixes: 183819bfd9
(cherry picked from commit fe4d8b0c80)
2024-05-23 11:28:20 +03:00
Lasse Collin 005f039864 xz: Add braces to a for-statement and to an if-statement
No functional changes.

Fixes: 5f0c5a0438
Fixes: 479fd58d60
(cherry picked from commit 9bef5b8d17)
2024-05-23 11:28:20 +03:00
Lasse Collin 34be4e6aa6 liblzma: Omit an unneeded array from the x86 filter
Fixes: 6aa2a6deeb
(cherry picked from commit de06b9f0c0)
2024-05-23 11:28:20 +03:00
Lasse Collin f99e7c69ad xzdec: Support Landlock ABI version 4
This was added to xz in 02e3505991
but I forgot to do the same in xzdec.

The Landlock sandbox in xzdec could be stricter as now it's
active only for the last file being decompressed. In xz,
read-only sandbox is used for multi-file case. On the other hand,
xz doesn't go to the strictest mode when processing the last file
when more than one file was specified; xzdec does.

(cherry picked from commit 3334c71d3d)
2024-05-23 00:13:43 +03:00
Lasse Collin bfe9be7a46 liblzma: Fix incorrect function type error from sanitizer
Clang 17 with -fsanitize=address,undefined:

    src/liblzma/common/filter_common.c:366:8: runtime error:
        call to function encoder_find through pointer to incorrect
        function type 'const lzma_filter_coder *(*)(unsigned long)'
    src/liblzma/common/filter_encoder.c:187: note:
        encoder_find defined here

Use a wrapper function to get the correct type neatly.
This reduces the number of casts needed too.

This issue could be a problem with control flow integrity (CFI)
methods that check the function type on indirect function calls.

Fixes: 3b34851de1
(cherry picked from commit 278563ef8f)
2024-05-23 00:13:43 +03:00
Lasse Collin 882eadc5b8 xz: Avoid arithmetic on a null pointer
It's undefined behavior. The result wasn't ever used as it occurred
in the last iteration of a loop.

Clang 17 with -fsanitize=address,undefined:

    $ src/xz/xz --block-list=123
    src/xz/args.c:164:12: runtime error: applying non-zero offset 1
        to null pointer

Fixes: 88ccf47205
Co-authored-by: Sam James <sam@gentoo.org>
(cherry picked from commit 77c8f60547)
2024-05-23 00:13:43 +03:00
Lasse Collin 28e7d130cb Build: Add --enable-doxygen to generate and install API docs
It requires Doxygen. This option is disabled by default.

(cherry picked from commit e21efdf96f)
2024-05-23 00:13:43 +03:00
Lasse Collin bae288ea6f liblzma: index_decoder: Fix missing initializations on LZMA_PROG_ERROR
If the arguments to lzma_index_decoder() or lzma_index_buffer_decode()
were such that LZMA_PROG_ERROR was returned, the lzma_index **i
argument wasn't touched even though the API docs say that *i = NULL
is done if an error occurs. This obviously won't be done even now
if i == NULL but otherwise it is best to do it due to the wording
in the API docs.

In practice this matters very little: The problem can occur only
if the functions are called with invalid arguments, that is,
the calling application must already have a bug.

(cherry picked from commit 71eed2520e)
2024-05-22 14:32:36 +03:00
Sam James 493bc57c33 liblzma: outqueue: add header guard
Reported by github's codeql.

(cherry picked from commit c7ef767c49)
2024-05-22 14:32:36 +03:00
Sam James cede418d4f liblzma: easy_preset: add header guard
Reported by github's codeql.

(cherry picked from commit 55dcae3056)
2024-05-22 14:32:36 +03:00
Lasse Collin 6e76a25df2 tuklib_integer: Rename bswapXX to byteswapXX
The __builtin_bswapXX from GCC and Clang are preferred when
they are available. This can allow compilers to emit the x86 MOVBE
instruction instead of doing a load + byteswap as two instructions
(which would happen if the byteswapping is done in inline asm).

bswap16, bswap32, and bswap64 exist in system headers on *BSDs
and Darwin. #defining bswap16 on NetBSD results in a warning about
macro redefinition. It's safest to avoid this namespace conflict
completely.

No OS supported by tuklib_integer.h uses byteswapXX names and
a web search doesn't immediately find any obvious danger of
namespace conflicts. So let's try these still-pretty-short names
for the macros.

Thanks to Sam James for pointing out the compiler warning on
NetBSD 10.0.

(cherry picked from commit 4ffc60f323)
2024-05-22 14:32:36 +03:00
Lasse Collin 0ca14871f3 liblzma: API doc cleanups
(cherry picked from commit 08ab0966a7)
2024-05-22 14:32:36 +03:00
Lasse Collin 3ba3ef57f9 liblzma: lzma_str_to_filters: Set *error_pos on all errors
The API docs clearly say that if error_pos isn't NULL then *error
is always set on any error. However, it wasn't touched if str == NULL
or filters == NULL or unsupported flags were specified.

Fixes: cedeeca2ea
(cherry picked from commit 70d12dd069)
2024-05-22 14:32:36 +03:00
Lasse Collin 57ad820e15 liblzma: Clean up white space
(cherry picked from commit ed8e552395)
2024-05-22 14:32:36 +03:00
Lasse Collin 6e210d5766 liblzma: Silence a warning from Coverity static analysis
It is logical why it cannot know for sure that the value has
to be at most 4 if it is less than 16.

The x86 filter is based on a very old LZMA SDK version. Newer
ones have quite a different implementation for the same filter.

Thanks to Sam James.

(cherry picked from commit 6aa2a6deeb)
2024-05-22 14:32:36 +03:00
Lasse Collin 7413383e42 xz: Fix white space error.
Thanks to xx on #tukaani.

(cherry picked from commit eeca8f7c5b)
2024-05-22 14:32:36 +03:00
Sam James eed2f26c0e xz: add missing noreturn for message_filters_help
Fixes: a165d7df19
(cherry picked from commit 462ca94099)
2024-05-22 14:32:36 +03:00
Sam James 2633d8df61 xz: signals: suppress -Wsign-conversion on macOS
On macOS, we get:
```
signals.c: In function 'signals_init':
signals.c:76:17: error: conversion to 'sigset_t' {aka 'unsigned int'} from 'int' may change the sign of the result [-Werror=sign-conversion]
   76 |                 sigaddset(&hooked_signals, sigs[i]);
      |                 ^~~~~~~~~
signals.c:81:17: error: conversion to 'sigset_t' {aka 'unsigned int'} from 'int' may change the sign of the result [-Werror=sign-conversion]
   81 |                 sigaddset(&hooked_signals, message_progress_sigs[i]);
      |                 ^~~~~~~~~
signals.c:86:9: error: conversion to 'sigset_t' {aka 'unsigned int'} from 'int' may change the sign of the result [-Werror=sign-conversion]
   86 |         sigaddset(&hooked_signals, SIGTSTP);
      |         ^~~~~~~~~
```

We use `int` for `hooked_signals` but we can't just cast to whatever
`sigset_t` is because `sigset_t` is an opaque type. It's an unsigned int
on macOS. On macOS, `sigaddset` is implemented as a macro.

Just suppress -Wsign-conversion for `signals_init` for macOS given
there's no real nice way of fixing this.

(cherry picked from commit 863f13d282)
2024-05-22 14:32:36 +03:00
Lasse Collin 5d20a61205 liblzma: CRC: Simplify table omission macros
A macro is useful to prevent a single #if directive from
getting too ugly but only one macro is needed for all archs.

(cherry picked from commit 6286c1900c)
2024-05-22 14:26:03 +03:00
Lasse Collin 2a80827e23 liblzma: ARM64 CRC: Fix omission of CRC32 table
The macro name had an odd typo so the table wasn't omitted
when it should have.

Fixes: 1940f0ec28
(cherry picked from commit 45da936c87)
2024-05-22 14:26:03 +03:00