mirror of https://git.tukaani.org/xz.git synced 2025-12-11 16:08:45 +00:00

2965 Commits

Author SHA1 Message Date
Lasse Collin
7971566247
Autotools: Autodetect unaligned access support on LoongArch
According to [1] sections 7.4, 8.1, and 8.2, desktop and server
processors support fast unaligned access, but embedded systems likely
don't.

It's important that TUKLIB_FAST_UNALIGNED_ACCESS isn't defined when
-mstrict-align is in use because it will result in slower binaries
even if running on a processor that supports fast unaligned access.
With -mstrict-align, compilers translate a multibyte memcpy() to
multiple byte-by-byte instructions instead of wider loads and stores.
The compression times from [2] show this well:

    Unaligned access    CFLAGS                     Compression time
        enabled         -O2 -mno-strict-align          66.1 s
        disabled        -O2 -mno-strict-align          79.5 s
        disabled        -O2 -mstrict-align             79.9 s
        enabled         -O2 -mstrict-align            129.1 s

As of GCC 15.2, there is no preprocessor macro on LoongArch to detect
whether -mstrict-align or -mno-strict-align is in effect (the default
is -mno-strict-align). Use heuristics to detect which of the flags is
in effect.

[1] https://github.com/loongson/la-softdev-convention/blob/v0.2/la-softdev-convention.adoc
[2] https://github.com/tukaani-project/xz/pull/186#issuecomment-3494570304

Thanks-to: Li Chenggang <lichenggang@deepin.org>
Thanks-to: Xi Ruoyao
See: https://github.com/tukaani-project/xz/pull/186
2025-12-09 17:18:22 +02:00
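The memcpy() point in commit 7971566247 above can be illustrated with a minimal sketch. It is not the tuklib code: SKETCH_FAST_UNALIGNED_ACCESS and SKETCH_LITTLE_ENDIAN are hypothetical macros standing in for whatever the build system defines, and all function names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Portable path: assemble the value from individual bytes. */
static inline uint32_t sketch_read32le_bytes(const uint8_t *buf)
{
    return (uint32_t)buf[0]
            | ((uint32_t)buf[1] << 8)
            | ((uint32_t)buf[2] << 16)
            | ((uint32_t)buf[3] << 24);
}

/* Fast path: memcpy() into a local variable, native byte order.
 * A compiler that may use unaligned loads can turn this into one
 * 32-bit load. Under a strict-alignment flag it typically has to
 * emit byte-sized loads instead. */
static inline uint32_t sketch_read32ne_memcpy(const uint8_t *buf)
{
    uint32_t v;
    memcpy(&v, buf, sizeof(v));
    return v;
}

/* Both macros below are assumptions made for this sketch only. */
static inline uint32_t sketch_read32le(const uint8_t *buf)
{
#if defined(SKETCH_FAST_UNALIGNED_ACCESS) && defined(SKETCH_LITTLE_ENDIAN)
    return sketch_read32ne_memcpy(buf);
#else
    return sketch_read32le_bytes(buf);
#endif
}
```

Both paths return the same value. The difference is only in the instructions the compiler is allowed to generate, which is why choosing the fast path under -mstrict-align can backfire.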
Lasse Collin
338f952c00
xz: Silence clang -Wunreachable-code-break
Fixes: a165d7df1964 ("xz: Add a new --filters-help option.")
2025-12-09 17:18:22 +02:00
Lasse Collin
723cee44d0
liblzma: Remove unwanted semicolons
These didn't affect control flow.
2025-12-09 17:18:22 +02:00
Lasse Collin
524f6a7384
Tests: Remove an unwanted semicolon from a macro definition
It didn't affect control flow.

Fixes: fe3bd438fb11 ("Tests: Fix memory leaks in test_block_header.")
2025-12-09 17:18:22 +02:00
Lasse Collin
0f41a28bfa
Build: Use -Wextra-semi-stmt when supported
2025-12-09 17:18:22 +02:00
Lasse Collin
91170c8cab
CI: Add clang-cl
Fixes: https://github.com/tukaani-project/xz/issues/18#issuecomment-3577456136
2025-12-09 17:18:22 +02:00
Lasse Collin
a3c6cb0911
xz/Windows: Add a missing #include to fix the build with clang-cl
Fixes: https://github.com/tukaani-project/xz/issues/18#issuecomment-1986829734
Fixes: https://github.com/tukaani-project/xz/issues/18#issuecomment-3577456136
2025-12-09 17:18:22 +02:00
Lasse Collin
c410ccc625
xz: Check return value of sigaction() before calling raise()
Fixes: Coverity CID 456022
2025-12-09 17:18:21 +02:00
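The pattern named in commit c410ccc625 above (check sigaction() before raise()) might look roughly like the following. This is a sketch under assumptions, not the code from xz: the function name is hypothetical, and it simply restores the default action and re-raises the signal.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>

/* Restore the default action for sig and re-raise it.
 * Returns 0 on success and -1 if either step fails.
 * (Hypothetical helper, for illustration only.) */
static int sketch_reraise_default(int sig)
{
    struct sigaction sa;
    sa.sa_handler = SIG_DFL;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;

    /* If the default action couldn't be installed, don't raise():
     * the old handler would still be in place. */
    if (sigaction(sig, &sa, NULL) != 0)
        return -1;

    return raise(sig) == 0 ? 0 : -1;
}
```

With a signal whose default action terminates the process, the call doesn't return on success, so the return value mostly matters for the failure path.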
Lasse Collin
6cc2da0a4b
liblzma: Refactor a loop in lzma_filters_copy()
Arguably it's nicer if i doesn't wrap around when the loop terminates.

Fixes: Coverity CID 464589
Fixes: 6d118a0b9def ("Add lzma_filters_copy().")
2025-12-09 17:18:21 +02:00
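The wrap-around mentioned in commit 6cc2da0a4b above is a general property of counting down with an unsigned index. The sketch below does not reproduce the loop in lzma_filters_copy(); the function names and the int array are hypothetical and only show the two loop shapes.

```c
#include <stddef.h>
#include <stdint.h>

/* Clears slots[0..i-1] in reverse order. The post-decrement in the
 * condition makes i wrap to SIZE_MAX when the loop terminates.
 * This is well-defined for unsigned types but easy to misread. */
static size_t sketch_cleanup_wrapping(int *slots, size_t i)
{
    while (i-- > 0)
        slots[i] = 0;

    return i; /* SIZE_MAX */
}

/* Same work, but i ends at 0 instead of wrapping around. */
static size_t sketch_cleanup_no_wrap(int *slots, size_t i)
{
    for (; i > 0; --i)
        slots[i - 1] = 0;

    return i; /* 0 */
}
```

Both variants touch the same elements. Only the final value of the index differs, which is the kind of thing a static analyzer may flag.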
Lasse Collin
90b67853d5
liblzma: Silence two Coverity warnings
lzma_lzma_decoder_memusage() returns UINT64_MAX if lc/lp/pb aren't
valid. alone_decoder.c and lzip_decoder.c didn't check the return
value because in both it is known that lc/lp/pb are valid. Make them
call the _nocheck() variant instead which skips the validation (it
already existed for LZMA2's internal use).

Fixes: Coverity CID 464658
Fixes: Coverity CID 897069
2025-12-09 17:18:21 +02:00
Lasse Collin
be365b7010
liblzma: Fix a harmless read of shared variable without mutex
The partial_update_mode enumeration had three states, _DISABLED,
_START, and _ENABLED. Main thread changed it from _DISABLED to _START
while holding a mutex. Once set to _START, worker thread changed it
to _ENABLED without a mutex. Later main thread read it without a mutex,
so it could see either _START or _ENABLED. However, it made no
difference because the main thread checked for != _DISABLED, so
it didn't matter if it saw _START or _ENABLED.

Nevertheless, such things must not be done. It's clear it was a mistake
because there were two comments that directly contradicted each
other about how the variable was accessed.

Split the enumeration into two booleans:

  - partial_update_enabled: A worker thread locks the mutex to read
    this variable and the main thread locks the mutex to change the
    value. Because only the main thread modifies the variable, the
    main thread can read the value without locking the mutex.
    This variable replaces the _DISABLED -> _START transition.

  - partial_update_started is for worker thread's internal use and thus
    needs no mutex. This replaces the _START -> _ENABLED transition.

Fixes: Coverity CID 456025
Fixes: bd93b776c1bd ("liblzma: Fix a deadlock in threaded decoder.")
2025-12-09 17:18:21 +02:00
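The access rules for the two booleans described in commit be365b7010 above can be sketched as follows. Only the names partial_update_enabled and partial_update_started come from the commit message; the struct, the function names, and the use of plain pthreads are assumptions made for illustration and do not reproduce liblzma's threaded decoder.

```c
#include <pthread.h>
#include <stdbool.h>

struct sketch_worker {
    pthread_mutex_t mutex;

    /* Written only by the main thread, always with the mutex held.
     * The worker thread reads it with the mutex held. */
    bool partial_update_enabled;

    /* Used only by the worker thread, so no mutex is needed. */
    bool partial_update_started;
};

/* Main thread: the only writer, so it locks the mutex to write. */
static void sketch_main_enable(struct sketch_worker *w)
{
    pthread_mutex_lock(&w->mutex);
    w->partial_update_enabled = true;
    pthread_mutex_unlock(&w->mutex);
}

/* Main thread: no other thread modifies the variable, so the
 * main thread may read it without locking. */
static bool sketch_main_is_enabled(const struct sketch_worker *w)
{
    return w->partial_update_enabled;
}

/* Worker thread: returns true exactly once, when partial updates
 * have been enabled and the worker hasn't started them yet. */
static bool sketch_worker_should_start(struct sketch_worker *w)
{
    pthread_mutex_lock(&w->mutex);
    const bool enabled = w->partial_update_enabled;
    pthread_mutex_unlock(&w->mutex);

    if (enabled && !w->partial_update_started) {
        w->partial_update_started = true;
        return true;
    }

    return false;
}
```

Each variable now has a single writer, so there is no state that one thread changes while another reads it without synchronization.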
Lasse Collin
2686554da0
CI: Add Coverity Scan
Co-authored-by: Lasse Collin <lasse.collin@tukaani.org>
Fixes: https://github.com/tukaani-project/xz/issues/198
2025-12-09 17:18:21 +02:00
Lasse Collin
1b30734c9c
Change the sorting order in THANKS
In short, sort the names with this command (-k1,1 isn't needed because
the lines with names start with "  -"):

    LC_ALL=en_US.UTF-8 sort -k2,2 -k3,3 -k4,4 -k5,5

When THANKS was created, I wrote the names as "First Last" and attempted
to keep them sorted by last name / surname / family name. This works
with many names in THANKS, but it becomes complicated with names that
don't fit that pattern. For example, names that are written as
"Last First" can be manually sorted by family name, but only if one
knows which part of the name is the family name.[*] And of course,
the concept of first/last name doesn't apply to all names.

[*] xz had a co-maintainer who could help me with such names,
    but fortunately he isn't working on the project anymore.

Adding the names in chronological order could have worked too, although
if something is contributed by multiple people, one would still have to
decide how to sort the names within the batch. Another downside would
be that if THANKS is updated in more than one work-in-progress branch,
merge conflicts would occur more often.

Don't attempt to sort by last name. Let's be happy that people tend to
provide names that can be expressed in a reasonable number of printable
Unicode characters. In practice, people have been even nicer: if the
native language doesn't use a Latin script alphabet, people often provide
a transliterated name (only or in addition to the original spelling),
which is very much appreciated by those who don't know the native script.

Treat the names as opaque strings or space-separated strings for sorting
purposes. This means that most names will now be sorted by first name.
There are still many ways to sort:

(1) LC_ALL=en_US.UTF-8 sort

    The project is in English, so this may sound like a logical choice.
    However, spaces have a lower weight than letters, which results in
    this order:

        - A Ba
        - Ab C
        - A Bc
        - A Bd

(2) LC_ALL=en_US.UTF-8 sort -k2,2

    This first sorts by the first word and then by the rest of the
    string. It's -k2,2 instead of -k1,1 to skip the leading dash.

        - A Ba
        - A Bc
        - A Bd
        - Ab C

    I like this more than (1). One could add -k3,3 -k4,4 -k5,5 ... too.
    With current THANKS it makes no difference but it might some day.

    NOTE: The ordering in en_US.UTF-8 can differ between libc versions
    and operating systems. Luckily it's not a big deal in THANKS.

(3) LC_ALL=en_US.UTF-8 sort -f -k2,2

    Passing -f (--ignore-case) to sort affects sorting of single-byte
    characters but not multibyte characters (GNU coreutils 9.9):

        No -f       With -f     LC_ALL=C
        Aa          A.A         A.A
        A.A         Aa          Aa
        Ää          Ää          Ä.Ä
        Ä.Ä         Ä.Ä         Ää

    In GNU coreutils, the THANKS file is sorted using "sort -f -k1,1".
    There is also a basic check that the en_US.UTF-8 locale is
    behaving as expected.

(4) LC_ALL=C sort

    This sorts by byte order which in UTF-8 is the same as Unicode
    code point order. With the strings in (1) and (2), this produces
    the same result as in (2). The difference in (3) can be seen above.

    The results differ from en_US.UTF-8 when a name component starts
    with a lower case ASCII letter (like "von" or "de"). Worse, any
    non-ASCII characters sort after ASCII chars. These properties might
    look weird in English language text, although it's good to remember
    that en_US.UTF-8 sorting can appear weird too if one's native
    language isn't English.

The choice between (2) and (4) was difficult but I went with (2).

;-)
2025-12-09 17:18:09 +02:00
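Option (4) in commit 1b30734c9c above, byte-order sorting, is what strcmp() gives in C, since it compares bytes as unsigned char. The following sketch (hypothetical function names, not part of xz) sorts the four example strings from options (1) and (2) that way.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator: plain byte-order comparison, which for UTF-8
 * input equals Unicode code point order. */
static int sketch_cmp_bytes(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/* Sort an array of strings like "LC_ALL=C sort" would. */
static void sketch_sort_c_locale(const char **names, size_t count)
{
    qsort(names, count, sizeof(names[0]), sketch_cmp_bytes);
}
```

Sorting "A Ba", "Ab C", "A Bc", "A Bd" this way yields "A Ba", "A Bc", "A Bd", "Ab C" because the space (0x20) sorts before every letter. This matches the order shown for option (2).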
Lasse Collin
8bb516887c
Landlock: Add missing #ifdefs
The build was broken on distros that have an old <sys/landlock.h>.

Fixes: 2b2652e914b1 ("Landlock: Workaround a bug in RHEL 9 kernel")
2025-11-23 20:39:28 +02:00
Lasse Collin
23c95c6a7c
Update THANKS
2025-11-23 20:13:50 +02:00
Lasse Collin
2b2652e914
Landlock: Workaround a bug in RHEL 9 kernel
If one runs xz 5.8.0 or 5.8.1 from some other distribution in a container
on RHEL 9, xz will fail with the message "Failed to enable the sandbox".

RHEL 9 kernel since 5.14.0-603.el9 (2025-07-30) claims to support
Landlock ABI version 6, but it lacks support for LANDLOCK_SCOPE_SIGNAL.
The issue is still present in 5.14.0-643.el9 (2025-11-22). Red Hat is
aware of the issue, but I don't know when it will be fixed.

The sandbox is meant to be transparent to users, thus there isn't and
won't be a command line option to disable it. Instead, add a workaround
to keep xz working on the buggy RHEL 9 kernels.

Reported-by: Richard W.M. Jones
Thanks-to: Pavel Raiskup
Tested-by: Orgad Shaneh
Tested-by: Richard W.M. Jones
Fixes: https://github.com/tukaani-project/xz/issues/199
Link: https://issues.redhat.com/browse/RHEL-125143
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2407105
Link: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/65BDSY56R5ZJRTUC4B6CIVCVLY4LG4ME/
2025-11-23 20:13:49 +02:00
Lasse Collin
ee75c76958
Landlock: Cache the ABI version
In xz it can avoid up to two syscalls that query the ABI version.
2025-11-23 20:13:37 +02:00
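The caching idea in commit ee75c76958 above is ordinary memoization of a query result. In the sketch below, sketch_query_abi() is a stand-in for the real Landlock ABI query (which needs a syscall) and returns a fixed value. All names are hypothetical, and the sketch is not thread-safe.

```c
#include <stdbool.h>

/* Counts how many times the (pretend) syscall was made. */
static int sketch_query_count = 0;

/* Stand-in for the real ABI version query. The returned value 6
 * is just an example number. */
static int sketch_query_abi(void)
{
    ++sketch_query_count;
    return 6;
}

/* Query once, then reuse the result on later calls. A separate
 * flag is used so that an error value would be cached too. */
static int sketch_get_abi(void)
{
    static bool cached = false;
    static int abi_version;

    if (!cached) {
        abi_version = sketch_query_abi();
        cached = true;
    }

    return abi_version;
}
```

Calling sketch_get_abi() several times runs the query only once.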
Lasse Collin
f57b1716cd
Update THANKS
2025-11-03 14:52:45 +02:00
Lasse Collin
211cde0923
mythread.h: Fix the build on Emscripten when threading is disabled
To make a non-threaded liblzma-only build work with WASI SDK, <signal.h>
and mythread_sigmask() were omitted from mythread.h in the commit
81db3b889830. This broke non-threaded full build with Emscripten because
src/xz/signals.c needs mythread_sigmask() (liblzma-only build was fine).

If __wasm__ is defined, omit <signal.h> and mythread_sigmask() in
non-threaded builds only when __EMSCRIPTEN__ isn't defined.

Reported-by: Marcus Tillmanns
Thanks-to: ChanTsune
Fixes: https://github.com/tukaani-project/xz/issues/161
Fixes: 81db3b889830 ("mythread.h: Disable signal functions in builds targeting Wasm + WASI.")
2025-11-03 14:48:15 +02:00
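The condition described in commit 211cde0923 above can be written as a preprocessor check along these lines. __wasm__ and __EMSCRIPTEN__ are the compiler-provided macros named in the commit message; SKETCH_THREADS_ENABLED and SKETCH_HAVE_SIGMASK are hypothetical names for this sketch and aren't taken from mythread.h.

```c
/* Omit the signal functions only for non-threaded Wasm builds
 * that do not target Emscripten (for example WASI SDK). */
#if defined(__wasm__) && !defined(__EMSCRIPTEN__) \
        && !defined(SKETCH_THREADS_ENABLED)
#   define SKETCH_HAVE_SIGMASK 0
#else
#   define SKETCH_HAVE_SIGMASK 1
#   include <signal.h>
#endif

/* Lets callers check at run time what was compiled in. */
static int sketch_have_sigmask(void)
{
    return SKETCH_HAVE_SIGMASK;
}
```

On a native (non-Wasm) build and on Emscripten this evaluates to 1, so code such as src/xz/signals.c would still find the signal functions.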
Lasse Collin
cbf50a99e3
Translations: Update the Serbian man page translations
The earlier bug fixes are now included in the Translation Project.
2025-11-03 11:56:32 +02:00
Lasse Collin
68d1591187
xz: Silence a compiler warning when signals_block_count is unused
Move the static variable signals_block_count to the #ifndef block
that already has the functions that need the variable.
2025-11-02 19:40:55 +02:00
Lasse Collin
beca015891
xz: Silence a warning from Clang on glibc systems
Fixes: e8838b2f5922 ("xz: Look at resource limits when determining the default memlimit")
2025-11-02 17:45:20 +02:00
Lasse Collin
3e394278ed
Translations: Update the Serbian man page translations
Preserve the bug fixes made in 71ad5e82888f and 4f52e7387012 because
upstream hasn't included them.
2025-11-02 14:37:52 +02:00
Lasse Collin
ace28e3573
Translations: Update the Korean man page translations
2025-11-02 14:27:10 +02:00
Lasse Collin
ffd14a099f
Translations: Update the Italian man page translations
2025-11-02 14:24:43 +02:00
Lasse Collin
6f3152874e
Translations: Update the Ukrainian man page translations
2025-11-02 14:12:23 +02:00
Lasse Collin
ef67e051d7
liblzma: Fix build on old Linux/glibc on ARM64
getauxval() can be available even if HWCAP_CRC32 isn't #defined, so
both have to be checked. HWCAP_CRC32 was added in glibc 2.24 (2016).

Fixes: https://github.com/tukaani-project/xz/issues/190
2025-10-31 19:21:48 +02:00
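The double check from commit ef67e051d7 above (getauxval() available and HWCAP_CRC32 defined) can be sketched like this. SKETCH_HAVE_GETAUXVAL is a hypothetical stand-in for a configure-time result, and the assumption that every Linux libc provides <sys/auxv.h> is a simplification made for this example only.

```c
#include <stdbool.h>

/* Assumption for this sketch: on Linux, <sys/auxv.h> exists and
 * declares getauxval(). A real build system would test for it. */
#if defined(__linux__)
#   include <sys/auxv.h>
#   define SKETCH_HAVE_GETAUXVAL 1
#endif

/* Returns true if the CPU reports the ARM64 CRC32 extension.
 * Both conditions are needed: getauxval() may exist in a libc
 * whose headers don't define HWCAP_CRC32 yet. */
static bool sketch_arm64_has_crc32(void)
{
#if defined(SKETCH_HAVE_GETAUXVAL) && defined(HWCAP_CRC32)
    return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
#else
    return false;
#endif
}
```

On platforms where either piece is missing, the function falls back to reporting no support instead of failing to compile.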
Lasse Collin
71c2ede383
CI: Update Solaris
2025-10-31 14:44:24 +02:00
Lasse Collin
02da8de0ed
CI: Update DragonFly BSD
2025-10-31 14:44:24 +02:00
Lasse Collin
75b18d325f
CI: Update NetBSD
2025-10-31 14:44:24 +02:00
Lasse Collin
0102072915
CI: Update FreeBSD
2025-10-31 14:44:24 +02:00
Lasse Collin
83419783a6
CI: Update OpenBSD
2025-10-31 14:44:18 +02:00
Lasse Collin
3b5f5af9bc
Update THANKS
2025-10-31 12:43:43 +02:00
Kirill A. Korinsky
e8838b2f59
xz: Look at resource limits when determining the default memlimit
When no memory usage limits have been set by the user, the default
for multithreaded mode has been 1/4 of total RAM. If this limit is
too high and memory allocation fails, liblzma (and xz) fail. Perhaps
liblzma should handle it better by reducing the number of threads
and continuing with the amount of memory it can allocate, but currently
that isn't the case.

If resource limits were set to about 1/4 of RAM or lower, then xz
could fail for the above reason. This commit makes xz look at
RLIMIT_DATA, RLIMIT_AS, and RLIMIT_VMEM when they are available,
and set the limit 64 MiB below the lowest of those limits. This is
more or less a hack just like the 1/4-of-RAM method is, but this is
simple and quick to implement.

On Linux, there are other limits like cgroup v2 memory.max which
can still make xz fail. The same is likely possible with FreeBSD's
rctl(8).

Co-authored-by: Lasse Collin <lasse.collin@tukaani.org>
Thanks-to: Fangrui Song
Fixes: https://github.com/tukaani-project/xz/issues/195
Closes: https://github.com/tukaani-project/xz/pull/196
2025-10-31 12:43:37 +02:00
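The approach in commit e8838b2f59 above (take the lowest of the resource limits and stay 64 MiB below it) might be sketched as follows. This isn't the code from xz. The function names are hypothetical, and two details are assumptions of this sketch: the result is never raised above the 1/4-of-RAM default, and a limit of 64 MiB or less is clamped to 1 byte.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/resource.h>

/* Safety margin kept below the lowest resource limit. */
#define SKETCH_MARGIN (UINT64_C(64) << 20)

/* limits[] holds resource limits in bytes; UINT64_MAX means
 * "unlimited or unavailable". */
static uint64_t sketch_limit_from(uint64_t default_limit,
        const uint64_t *limits, size_t count)
{
    uint64_t lowest = UINT64_MAX;
    for (size_t i = 0; i < count; ++i)
        if (limits[i] < lowest)
            lowest = limits[i];

    if (lowest == UINT64_MAX)
        return default_limit;

    /* Assumption: clamp to 1 byte if the limit is tiny. */
    const uint64_t reduced = lowest > SKETCH_MARGIN
            ? lowest - SKETCH_MARGIN : 1;

    /* Assumption: never exceed the default. */
    return reduced < default_limit ? reduced : default_limit;
}

/* Read one soft limit; UINT64_MAX if unlimited or on error. */
static uint64_t sketch_one_rlimit(int resource)
{
    struct rlimit rl;
    if (getrlimit(resource, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return UINT64_MAX;

    return (uint64_t)rl.rlim_cur;
}

/* Combine the limits that exist on this platform. */
static uint64_t sketch_default_memlimit(uint64_t quarter_of_ram)
{
    uint64_t limits[3];
    size_t n = 0;
#ifdef RLIMIT_DATA
    limits[n++] = sketch_one_rlimit(RLIMIT_DATA);
#endif
#ifdef RLIMIT_AS
    limits[n++] = sketch_one_rlimit(RLIMIT_AS);
#endif
#ifdef RLIMIT_VMEM
    limits[n++] = sketch_one_rlimit(RLIMIT_VMEM);
#endif
    return sketch_limit_from(quarter_of_ram, limits, n);
}
```

For example, with a 1024 MiB default and limits of 512 MiB and 256 MiB, this sketch yields 192 MiB. As the commit message notes, limits such as cgroup v2 memory.max aren't visible through getrlimit() at all.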
Lasse Collin
8d26b72915
CI: Remove windows-2019 (which had VS 2019)
GitHub has removed the runner image.

A breakage with CLMUL CRC code occurred with VS 2019 but not 2022,
see b5a5d9e3f702. MS supports VS 2019 for a few more years, so it's
unfortunate that it can no longer be tested on GitHub.
2025-10-01 12:50:53 +03:00
Lasse Collin
32412bd2a4
Update THANKS
2025-09-29 19:34:58 +03:00
Lakshmi-Surekha
eaa150df98
xz: Don't fsync() directories on AIX
It fails with EBADF.

Fixes: https://github.com/tukaani-project/xz/issues/188
Closes: https://github.com/tukaani-project/xz/pull/189
2025-09-29 19:25:11 +03:00
Lasse Collin
61b114e92f
liblzma: Document that lzma_allocator.free(opaque, NULL) is possible
It feels better to fix the docs than change the code because this
way newly-written applications will be forced to be compatible with
the lzma_allocator behavior of old liblzma versions. It can matter
if someone builds the application against an older liblzma version.

Fixes: https://github.com/tukaani-project/xz/issues/183
2025-09-29 18:37:19 +03:00
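For application authors, the documentation change in commit 61b114e92f above means a custom free callback should tolerate a NULL pointer. The sketch below uses callbacks with the same parameter lists as the alloc and free members of lzma_allocator, but it deliberately doesn't include <lzma.h>. The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping passed via the opaque pointer. */
struct sketch_alloc_stats {
    size_t live; /* Number of allocations not yet freed */
};

/* Allocation callback: (opaque, nmemb, size). */
static void *sketch_alloc(void *opaque, size_t nmemb, size_t size)
{
    struct sketch_alloc_stats *stats = opaque;

    /* Reject nmemb * size overflow. */
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;

    void *ptr = malloc(nmemb * size);
    if (ptr != NULL)
        ++stats->live;

    return ptr;
}

/* Free callback: (opaque, ptr). It must cope with ptr == NULL,
 * so the counter is only touched for real allocations. */
static void sketch_free(void *opaque, void *ptr)
{
    struct sketch_alloc_stats *stats = opaque;

    if (ptr == NULL)
        return;

    --stats->live;
    free(ptr);
}
```

Without the NULL check, a call like free(opaque, NULL) would make the counter in this sketch underflow, which is the kind of bug the updated documentation helps applications avoid.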
Simon Josefsson
6d287a3ae9
Update GPLv2 and LGPLv2.1 copies from gnu.org
Closes: https://github.com/tukaani-project/xz/pull/194
2025-09-29 17:55:41 +03:00
Lasse Collin
41a421dbad
tests/test_suffix.sh: Avoid variables in printf format string
2025-09-29 17:50:46 +03:00
Lasse Collin
a2c6aa8764
build-aux/manconv.sh: Add quotes
2025-09-29 17:50:46 +03:00
Lasse Collin
8e4153253e
windows/build.bash: Add quotes
In this case they aren't needed but it's better style.
2025-09-29 17:50:46 +03:00
Lasse Collin
37a57a926d
po4a/update-po: Ensure that a glob won't expand to a command line option
2025-09-29 17:50:45 +03:00
Lasse Collin
e3ba73034a
liblzma: validate_map.sh: Catch some unlikely errors
2025-09-29 17:50:45 +03:00
Lasse Collin
067cecdea6
CI: Catch unsupported arguments in ci_build.bash
2025-09-29 17:50:45 +03:00
Lasse Collin
4fc6208abe
Scripts: Add shellcheck directives to silence warnings
Set also shell because the xz*.in files start with '#!@POSIX_SHELL@'.

SC1003 and SC2016 are only info messages, not warnings. Several other
shellcheck info messages remain. They are safe to ignore, but I didn't
want to disable them now.

Partially-fixes: https://github.com/tukaani-project/xz/issues/174
2025-09-29 17:50:45 +03:00
Lasse Collin
7844aff1a8
Scripts: Silence two shellcheck warnings
2025-09-29 17:50:39 +03:00
Lasse Collin
4d439aaeed
Translations: Add Swedish man page translations
2025-09-29 17:29:23 +03:00
Lasse Collin
dd4a1b2599
CI: Add timeout-minutes
Sometimes the VM workflows (like FreeBSD VM on Ubuntu) get stuck
and the default timeout is six hours. While at it, set a sensible
timeout for all workflows.
2025-05-23 13:09:14 +03:00
Lasse Collin
d660fe5d56
liblzma: Fix grammar in API docs
Fixes: a27920002dbc ("liblzma: Add generic support for input seeking (LZMA_SEEK).")
2025-05-23 12:28:17 +03:00