Commit Graph

1339 Commits

Author SHA1 Message Date
Lasse Collin c7164b1927 xz: Fix white space 2024-06-11 22:42:26 +03:00
Lasse Collin 0a32d2072c liblzma: Fix a typo in a comment
Thanks to Sam James for spotting it.

Fixes: f644473a21
2024-06-11 22:42:04 +03:00
Lasse Collin afd9b4d282 liblzma: Fix a comment indentation 2024-06-10 23:19:27 +03:00
Lasse Collin 50e6bff274 liblzma: Fix white space 2024-06-10 23:19:27 +03:00
Lasse Collin caea7844d3 tuklib: __STDC_VERSION__ in C23 is 202311 2024-06-10 23:19:27 +03:00
RainRat 9e73918a4f Fix typos
Closes: https://github.com/tukaani-project/xz/pull/124
2024-06-07 16:01:27 +03:00
Lasse Collin 04b23addf3 tuklib_integer: Fix building on OpenBSD/sparc64 that uses GCC 4.2
GCC 4.2 doesn't have __builtin_bswap16() and friends so tuklib_integer.h
tries to use OS-specific byte swap methods instead. On OpenBSD those
macros are swap16/32/64 instead of bswap16/32/64 like on other *BSDs
and Darwin.

An alternative to "#ifdef __OpenBSD__" could be "#ifdef swap16" as it
is a macro. But since OpenBSD seems to be a special case under this
special case of "*BSDs and Darwin", checking for __OpenBSD__ seems
the more conservative choice now.

Thanks to Christian Weisgerber and Brad Smith who both submitted
the same patch a few hours apart.

Co-authored-by: Christian Weisgerber <naddy@mips.inka.de>
Co-authored-by: Brad Smith <brad@comstyle.com>
Closes: https://github.com/tukaani-project/xz/pull/126
2024-06-07 15:47:20 +03:00
Lasse Collin dc03f6290f liblzma: Add ARM64 CRC32 instruction support detection on OpenBSD
The C code is from Christian Weisgerber, I merely reordered the OSes.
Then I added the build system checks without testing them.

Also thanks to Brad Smith who submitted a similar patch on GitHub
a few hours after Christian had sent his via email.

Co-authored-by: Christian Weisgerber <naddy@mips.inka.de>
Closes: https://github.com/tukaani-project/xz/pull/125
2024-06-07 15:06:59 +03:00
Sam James b69768c8bd xz: list: suppress -Wformat-nonliteral for Solaris
Solaris' GCC can't understand that our use is fine, unlike modern compilers:
```
list.c: In function 'print_totals_basic':
list.c:1191:4: error: format not a string literal, argument types not checked [-Werror=format-nonliteral]
  uint64_to_str(totals.files, 0));
  ^~~~~~~~~~~~~
cc1: all warnings being treated as errors
```

It's presumably because of older gettext missing format attributes.

This is with `gcc (GCC) 7.3.0`.
2024-06-03 12:32:34 +03:00
Lasse Collin 4e9023857d Fix typos
Thanks to xx on #tukaani.
2024-05-18 00:34:07 +03:00
Lasse Collin b14d08fbbc liblzma: Fix white space
Thanks to xx on #tukaani.
2024-05-18 00:24:50 +03:00
Lasse Collin 142e670a41 xz: Document the static function get_chains_memusage() 2024-05-13 18:00:41 +03:00
Lasse Collin 78e984399a xz: Rename filters_memusage_max() to get_chains_memusage() 2024-05-13 18:00:41 +03:00
Lasse Collin 54c3db0a83 xz: Rename filter_memusages to chains_memusages 2024-05-13 18:00:41 +03:00
Lasse Collin d9e1ae79ec xz: Simplify the memory usage scaling code
This is closer to what it was before the --filtersX support was added,
just extended to support for scaling all filter chains. The method
before this commit was an extended version of the original too but
it was done in a more complex way for no clear reason. In case of
an error, the complex version printed fewer informative messages
(a good thing) but it's not a sigificant benefit.

In the limit is too low even for single-threaded mode, the required
amount of memory is now reported like in 5.4.x instead of like in
5.5.1alpha - 5.6.1 which showed the original non-scaled usage. It
had been a FIXME in the old code but it's not clear what message
makes the most sense.

Fixes: 5f0c5a0438
2024-05-13 18:00:41 +03:00
Lasse Collin 0ee56983d1 xz: Edit comments 2024-05-13 18:00:41 +03:00
Lasse Collin ec82a49c35 xz: Rename chain_idx to chain_num 2024-05-13 18:00:41 +03:00
Lasse Collin a731a6993c xz: Edit coding style 2024-05-13 18:00:41 +03:00
Lasse Collin 32eb176b89 xz: Edit comments
Fixes: 5f0c5a0438
2024-05-13 15:41:48 +03:00
Lasse Collin b90339f4da xz: Fix grammar in a comment
Fixes: cb3111e3ed
2024-05-13 15:41:48 +03:00
Lasse Collin 4c0bdaf13d xz: Rename filter_memusages to encoder_memusages 2024-05-13 15:41:46 +03:00
Lasse Collin b54aa023e0 xz: Edit coding style 2024-05-13 15:41:05 +03:00
Lasse Collin 49f67d3d3f xz: Rename filters_index to chain_num
The reason is the same as in bd0782c1f13e52cd0fd8415208e30e47004a4c68.
2024-05-13 15:41:05 +03:00
Lasse Collin ff9e8b3d06 xz: Replace a few uint32_t with "unsigned" to reduce the number of casts
These hold only tiny values.
2024-05-13 15:41:05 +03:00
Lasse Collin b5e6c1113b xz: Rename filters_used_mask to chains_used_mask
The reason is the same as in bd0782c1f13e52cd0fd8415208e30e47004a4c68.
2024-05-13 15:41:05 +03:00
Lasse Collin 32500dfaad xz: Move the setting of "check" in coder_set_compression_settings()
It's more logical to do it in the beginning instead of in the middle
of the filter chain handling.

Fixes: d6af7f3470
2024-05-13 15:41:05 +03:00
Lasse Collin ad146b1f42 xz: Rename "filters" to "chains"
The convention is that

    lzma_filter filters[LZMA_FILTERS_MAX + 1];

contains the filters of a single filter chain.
It was so here as well before the commit
d6af7f3470.
It changes "filters" to a ten-element array of filter chains.
It's clearer to call this array-of-arrays "chains".

This also renames "filter_idx" to "chain_idx" which is used
as an index as in chains[chain_idx].
2024-05-13 15:40:58 +03:00
Lasse Collin 5a4ae4e4d0 xz: Clean up a comment 2024-05-13 15:39:39 +03:00
Lasse Collin 2de80494ed xz: Add clarifying assertions 2024-05-13 15:39:39 +03:00
Lasse Collin 1eaad004bf xz: Add a clarifying assertion
Fixes: 5f0c5a0438
2024-05-13 15:39:39 +03:00
Lasse Collin 605094329b xz: Clarify a comment 2024-05-13 15:39:39 +03:00
Lasse Collin 8fac2577f2 xz: Use the info collected in parse_block_list()
This is slightly simpler and it avoids looping through
the opt_block_list array.
2024-05-13 15:39:39 +03:00
Lasse Collin 81d350dab8 xz: Remember the filter chains and the largest Block in parse_block_list() 2024-05-13 15:39:39 +03:00
Lasse Collin 46ab56968f xz: Update a comment and initialization of filters_used_mask 2024-05-13 15:39:39 +03:00
Lasse Collin e89293a0ba xz: parse_block_list: Edit integer type casting 2024-05-13 15:39:39 +03:00
Lasse Collin 87011e40c1 xz: Make filter_memusages a local variable 2024-05-13 15:39:12 +03:00
Lasse Collin 347b412a93 xz: Remove unused code and simplify
opt_mode == MODE_COMPRESS isn't possible when HAVE_ENCODERS isn't
defined. Thus, when *encoding*, the message about *decoder* memory
usage is possible to show only when both encoder and decoder have
been built.

Since the message is shown only at V_DEBUG, skip the memusage
calculation if verbosity level isn't high enough.

Fixes: 5f0c5a0438
2024-05-13 15:31:15 +03:00
Lasse Collin 31358c057c xz: Fix integer type from uint64_t to uint32_t
lzma_options_lzma.dict_size is uint32_t so use it here too.

Fixes: 5f0c5a0438
2024-05-11 00:29:24 +03:00
Lasse Collin e4780244a1 xz: Edit comments and coding style 2024-05-07 13:12:17 +03:00
Lasse Collin fe4d8b0c80 xz: Omit an incorrect comment
It likely was a leftover from a development version of the code.

Fixes: 183819bfd9
2024-05-06 23:09:13 +03:00
Lasse Collin 9bef5b8d17 xz: Add braces to a for-statement and to an if-statement
No functional changes.

Fixes: 5f0c5a0438
Fixes: 479fd58d60
2024-05-06 23:04:31 +03:00
Lasse Collin de06b9f0c0 liblzma: Omit an unneeded array from the x86 filter
Fixes: 6aa2a6deeb
2024-05-06 23:00:09 +03:00
Lasse Collin 3334c71d3d xzdec: Support Landlock ABI version 4
This was added to xz in 02e3505991
but I forgot to do the same in xzdec.

The Landlock sandbox in xzdec could be stricter as now it's
active only for the last file being decompressed. In xz,
read-only sandbox is used for multi-file case. On the other hand,
xz doesn't go to the strictest mode when processing the last file
when more than one file was specified; xzdec does.
2024-04-30 22:24:13 +03:00
Lasse Collin 278563ef8f liblzma: Fix incorrect function type error from sanitizer
Clang 17 with -fsanitize=address,undefined:

    src/liblzma/common/filter_common.c:366:8: runtime error:
        call to function encoder_find through pointer to incorrect
        function type 'const lzma_filter_coder *(*)(unsigned long)'
    src/liblzma/common/filter_encoder.c:187: note:
        encoder_find defined here

Use a wrapper function to get the correct type neatly.
This reduces the number of casts needed too.

This issue could be a problem with control flow integrity (CFI)
methods that check the function type on indirect function calls.

Fixes: 3b34851de1
2024-04-30 22:22:45 +03:00
Lasse Collin 77c8f60547 xz: Avoid arithmetic on a null pointer
It's undefined behavior. The result wasn't ever used as it occurred
in the last iteration of a loop.

Clang 17 with -fsanitize=address,undefined:

    $ src/xz/xz --block-list=123
    src/xz/args.c:164:12: runtime error: applying non-zero offset 1
        to null pointer

Fixes: 88ccf47205
Co-authored-by: Sam James <sam@gentoo.org>
2024-04-30 21:41:11 +03:00
Lasse Collin e21efdf96f Build: Add --enable-doxygen to generate and install API docs
It requires Doxygen. This option is disabled by default.
2024-04-30 17:09:08 +03:00
Lasse Collin 71eed2520e liblzma: index_decoder: Fix missing initializations on LZMA_PROG_ERROR
If the arguments to lzma_index_decoder() or lzma_index_buffer_decode()
were such that LZMA_PROG_ERROR was returned, the lzma_index **i
argument wasn't touched even though the API docs say that *i = NULL
is done if an error occurs. This obviously won't be done even now
if i == NULL but otherwise it is best to do it due to the wording
in the API docs.

In practice this matters very little: The problem can occur only
if the functions are called with invalid arguments, that is,
the calling application must already have a bug.
2024-04-27 14:33:38 +03:00
Sam James c7ef767c49 liblzma: outqueue: add header guard
Reported by github's codeql.
2024-04-25 14:04:24 +03:00
Sam James 55dcae3056 liblzma: easy_preset: add header guard
Reported by github's codeql.
2024-04-25 14:04:24 +03:00
Lasse Collin 4ffc60f323 tuklib_integer: Rename bswapXX to byteswapXX
The __builtin_bswapXX from GCC and Clang are preferred when
they are available. This can allow compilers to emit the x86 MOVBE
instruction instead of doing a load + byteswap as two instructions
(which would happen if the byteswapping is done in inline asm).

bswap16, bswap32, and bswap64 exist in system headers on *BSDs
and Darwin. #defining bswap16 on NetBSD results in a warning about
macro redefinition. It's safest to avoid this namespace conflict
completely.

No OS supported by tuklib_integer.h uses byteswapXX names and
a web search doesn't immediately find any obvious danger of
namespace conflicts. So let's try these still-pretty-short names
for the macros.

Thanks to Sam James for pointing out the compiler warning on
NetBSD 10.0.
2024-04-25 14:00:57 +03:00