Commit Graph

879 Commits

Author SHA1 Message Date
Lasse Collin a3de1b841e liblzma: Move a few __attribute__ uses in function declarations.
The API headers have many attributes but these were left
as is for now.

(cherry picked from commit e3478ae4f3)
2024-05-07 15:58:21 +03:00
Lasse Collin 737318447a xz, xzdec, lzmainfo: Use tuklib_attr_noreturn.
For compatibility with C23's [[noreturn]], tuklib_attr_noreturn
must be at the beginning of declaration (before "extern" or
"static", and even before any GNU C's __attribute__).

This commit also moves all other function attributes to
the beginning of function declarations. "extern" is kept
at the beginning of a line so the attributes are listed on
separate lines before "extern" or "static".

(cherry picked from commit b71b8922ef)
2024-05-07 15:58:20 +03:00
Lasse Collin 015e62b18d Remove incorrect uses of __attribute__((__malloc__)).
xrealloc() is obviously incorrect, modern GCC docs even
mention realloc() as an example where this attribute
cannot be used.

liblzma's lzma_alloc() and lzma_alloc_zero() would be
correct uses most of the time but custom allocators
may use a memory pool or otherwise hold the pointer
so aliasing issues could happen in theory.

The xstrdup() case likely was correct but I removed it anyway.
Now there are no __malloc__ attributes left in the code.
The allocations aren't in hot paths so this should make
no practical difference.

(cherry picked from commit 359e5c6cb1)
2024-05-07 15:44:54 +03:00
Lasse Collin df8daea282 xz: Fix a too relaxed assertion and remove uses of SSIZE_MAX.
SSIZE_MAX isn't readily available on MSVC. Removing it means
that there is one thing less to worry when porting to MSVC.

(cherry picked from commit ef71f83973)
2024-05-07 15:32:03 +03:00
Jia Tan 519896fc94 liblzma: Update assert in vli_ceil4().
The argument to vli_ceil4() should always guarantee the return value
is also a valid lzma_vli. Thus the highest three valid lzma_vli values
are invalid arguments. All uses of the function ensure this so the
assert is updated to match this.

(cherry picked from commit 773f1e8622)
2024-05-07 15:31:30 +03:00
Jia Tan 591ac56d42 liblzma: Add overflow check for Unpadded size in lzma_index_append().
This was not a security bug since there was no path to overflow
UINT64_MAX in lzma_index_append() or when it calls index_file_size().
The bug was discovered by a failing assert() in vli_ceil4() when called
from index_file_size() when unpadded_sum (the sum of the compressed size
of current Stream and the unpadded_size parameter) exceeds LZMA_VLI_MAX.

Previously, the unpadded_size parameter was checked to be not greater
than UNPADDED_SIZE_MAX, but no check was done once compressed_base was
added.

This could not have caused an integer overflow in index_file_size() when
called by lzma_index_append(). The calculation for file_size breaks down
into the sum of:

- Compressed base from all previous Streams
- 2 * LZMA_STREAM_HEADER_SIZE (size of the current Streams header and
  footer)
- stream_padding (can be set by lzma_index_stream_padding())
- Compressed base from the current Stream
- Unpadded size (parameter to lzma_index_append())

The sum of everything except for Unpadded size must be less than
LZMA_VLI_MAX. This is guarenteed by overflow checks in the functions
that can set these values including lzma_index_stream_padding(),
lzma_index_append(), and lzma_index_cat(). The maximum value for
Unpadded size is enforced by lzma_index_append() to be less than or
equal UNPADDED_SIZE_MAX. Thus, the sum cannot exceed UINT64_MAX since
LZMA_VLI_MAX is half of UINT64_MAX.

Thanks to Joona Kannisto for reporting this.

(cherry picked from commit 68bda971bb)
2024-05-07 15:31:30 +03:00
Jamaika1 ec0d5c99c3 mythread.h: Fix typo error in Vista threads mythread_once().
The "once_" variable was accidentally referred to as just "once". This
prevented building with Vista threads when
HAVE_FUNC_ATTRIBUTE_CONSTRUCTOR was not defined.

(cherry picked from commit c0c0cd4a48)
2024-05-07 15:30:38 +03:00
Jia Tan 9d4bf2d06f liblzma: Prevent an empty translation unit in Windows builds.
To workaround Automake lacking Windows resource compiler support, an
empty source file is compiled to overwrite the resource files for static
library builds. Translation units without an external declaration are
not allowed by the C standard and result in a warning when used with
-Wempty-translation-unit (Clang) or -pedantic (GCC).

(cherry picked from commit 19899340cf)
2024-05-07 15:28:35 +03:00
Lasse Collin 5a87d91321 liblzma: Tweak #if condition in memcmplen.h.
Maybe ICC always #defines _MSC_VER on Windows but now
it's very clear which code will get used.

(cherry picked from commit b406828a6d)
2024-05-07 15:28:35 +03:00
Lasse Collin 0c53f52657 liblzma: Omit unnecessary parenthesis in a preprocessor directive.
(cherry picked from commit ef4a07ad94)
2024-05-07 15:28:35 +03:00
Jia Tan eede1df4af liblzma: Prevent warning for MSYS2 Windows build.
In lzma_memcmplen(), the <intrin.h> header file is only included if
_MSC_VER and _M_X64 are both defined but _BitScanForward64() was
previously used if _M_X64 was defined. GCC for MSYS2 defines _M_X64 but
not _MSC_VER so _BitScanForward64() was used without including
<intrin.h>.

Now, lzma_memcmplen() will use __builtin_ctzll() for MSYS2 GCC builds as
expected.

(cherry picked from commit 64ee0caaea)
2024-05-07 15:28:35 +03:00
Jia Tan 5f9bf81044 liblzma: Prevent uninitialzed warning in mt stream encoder.
This change only impacts the compiler warning since it was impossible
for the wait_abs struct in stream_encode_mt() to be used before it was
initialized since mythread_condtime_set() will always be called before
mythread_cond_timedwait().

Since the mythread.h code is different between the POSIX and
Windows versions, this warning was only present on Windows builds.

Thanks to Arthur S for reporting the warning and providing an initial
patch.

(cherry picked from commit 1155471651)
2024-05-07 15:23:51 +03:00
Jia Tan 774145adfd Bump version and soname for 5.2.12. 2023-05-04 21:55:11 +08:00
Lasse Collin 809a2fd698 tuklib_integer.h: Fix a recent copypaste error in Clang detection.
Wrong line was changed in 7062348bf3.
Also, this has >= instead of == since ints larger than 32 bits would
work too even if not relevant in practice.
2023-05-03 22:55:59 +03:00
Jia Tan b02e74eb73 Windows: Include <intrin.h> when needed.
Legacy Windows did not need to #include <intrin.h> to use the MSVC
intrinsics. Newer versions likely just issue a warning, but the MSVC
documentation says to include the header file for the intrinsics we use.

GCC and Clang can "pretend" to be MSVC on Windows, so extra checks are
needed in tuklib_integer.h to only include <intrin.h> when it will is
actually needed.
2023-05-03 22:33:10 +03:00
Jia Tan 8efb6ea63b tuklib_integer: Use __builtin_clz() with Clang.
Clang has support for __builtin_clz(), but previously Clang would
fallback to either the MSVC intrinsic or the regular C code. This was
discovered due to a bug where a new version of Clang required the
<intrin.h> header file in order to use the MSVC intrinsics.

Thanks to Anton Kochkov for notifying us about the bug.
2023-05-03 22:33:10 +03:00
Lasse Collin 3bd906f1f3 liblzma: Update project maintainers in lzma.h.
AUTHORS was updated earlier, lzma.h was simply forgotten.
2023-05-03 22:33:10 +03:00
Jia Tan 0ab5527c46 liblzma: Cleans up old commented out code. 2023-05-03 22:33:10 +03:00
Jia Tan 275e36013d Build: Removes redundant check for LZMA1 filter support. 2023-05-03 22:33:10 +03:00
Jia Tan 3e206e5c43 Bump version and soname for 5.2.11. 2023-03-18 22:25:59 +08:00
Lasse Collin 090ea9ddd3 Change a few HTTP URLs to HTTPS.
The xz man page timestamp was intentionally left unchanged.
2023-03-18 22:00:28 +08:00
Lasse Collin 09363bea46 liblzma: Avoid null pointer + 0 (undefined behavior in C).
In the C99 and C17 standards, section 6.5.6 paragraph 8 means that
adding 0 to a null pointer is undefined behavior. As of writing,
"clang -fsanitize=undefined" (Clang 15) diagnoses this. However,
I'm not aware of any compiler that would take advantage of this
when optimizing (Clang 15 included). It's good to avoid this anyway
since compilers might some day infer that pointer arithmetic implies
that the pointer is not NULL. That is, the following foo() would then
unconditionally return 0, even for foo(NULL, 0):

    void bar(char *a, char *b);

    int foo(char *a, size_t n)
    {
        bar(a, a + n);
        return a == NULL;
    }

In contrast to C, C++ explicitly allows null pointer + 0. So if
the above is compiled as C++ then there is no undefined behavior
in the foo(NULL, 0) call.

To me it seems that changing the C standard would be the sane
thing to do (just add one sentence) as it would ensure that a huge
amount of old code won't break in the future. Based on web searches
it seems that a large number of codebases (where null pointer + 0
occurs) are being fixed instead to be future-proof in case compilers
will some day optimize based on it (like making the above foo(NULL, 0)
return 0) which in the worst case will cause security bugs.

Some projects don't plan to change it. For example, gnulib and thus
many GNU tools currently require that null pointer + 0 is defined:

    https://lists.gnu.org/archive/html/bug-gnulib/2021-11/msg00000.html

    https://www.gnu.org/software/gnulib/manual/html_node/Other-portability-assumptions.html

In XZ Utils null pointer + 0 issue should be fixed after this
commit. This adds a few if-statements and thus branches to avoid
null pointer + 0. These check for size > 0 instead of ptr != NULL
because this way bugs where size > 0 && ptr == NULL will likely
get caught quickly. None of them are in hot spots so it shouldn't
matter for performance.

A little less readable version would be replacing

    ptr + offset

with

    offset != 0 ? ptr + offset : ptr

or creating a macro for it:

    #define my_ptr_add(ptr, offset) \
            ((offset) != 0 ? ((ptr) + (offset)) : (ptr))

Checking for offset != 0 instead of ptr != NULL allows GCC >= 8.1,
Clang >= 7, and Clang-based ICX to optimize it to the very same code
as ptr + offset. That is, it won't create a branch. So for hot code
this could be a good solution to avoid null pointer + 0. Unfortunately
other compilers like ICC 2021 or MSVC 19.33 (VS2022) will create a
branch from my_ptr_add().

Thanks to Marcin Kowalczyk for reporting the problem:
https://github.com/tukaani-project/xz/issues/36
2023-03-11 21:47:47 +02:00
Jia Tan 050c6dbf96 liblzma: Fix documentation for LZMA_MEMLIMIT_ERROR.
LZMA_MEMLIMIT_ERROR was missing the "<" character needed to put
documentation after a member.
2023-03-11 21:45:26 +02:00
Jia Tan 8daaac8e10 tuklib_physmem: Silence warning from -Wcast-function-type on MinGW-w64.
tuklib_physmem depends on GetProcAddress() for both MSVC and MinGW-w64
to retrieve a function address. The proper way to do this is to cast the
return value to the type of function pointer retrieved. Unfortunately,
this causes a cast-function-type warning, so the best solution is to
simply ignore the warning.
2023-03-11 21:45:26 +02:00
Jia Tan 6c9a2c2e46 xz: Add missing comment for coder_set_compression_settings() 2023-03-11 21:45:26 +02:00
Jia Tan ccbb991efa xz: Do not set compression settings with raw format in list mode.
Calling coder_set_compression_settings() in list mode with verbose mode
on caused the filter chain and memory requirements to print. This was
unnecessary since the command results in an error and not consistent
with other formats like lzma and alone.
2023-03-11 21:45:26 +02:00
Lasse Collin 6df383be4a xz: Use ssize_t for the to-be-ignored return value from write(fd, ptr, 1).
It makes no difference here as the return value fits into an int
too and it then gets ignored but this looks better.
2023-03-11 21:45:26 +02:00
Lasse Collin 2ca95b7cfe liblzma: Silence warnings from clang -Wconditional-uninitialized.
This is similar to 2ce4f36f17.
The actual initialization of the variables is done inside
mythread_sync() macro. Clang doesn't seem to see that
the initialization code inside the macro is always executed.
2023-03-11 21:38:31 +02:00
Lasse Collin f900dd937f Fix warnings from clang -Wdocumentation. 2023-03-11 21:37:49 +02:00
Jia Tan 3e2b345cfd xz: Fix warning -Wformat-nonliteral on clang in message.c.
clang and gcc differ in how they handle -Wformat-nonliteral. gcc will
allow a non-literal format string as long as the function takes its
format arguments as a va_list.
2023-03-11 21:37:49 +02:00
Jia Tan 2155fef528 liblzma: Update documentation for lzma_filter_encoder. 2023-03-11 21:34:26 +02:00
Lasse Collin f7c2cc5561 Bump version and soname for 5.2.10. 2022-12-13 13:03:20 +02:00
Lasse Collin 59a17888e9 xz: Make args_info.files_name a const pointer. 2022-12-12 15:53:03 +02:00
Lasse Collin 4af80d4f51 xz: Don't modify argv[].
The code that parses --memlimit options and --block-list modified
the argv[] when parsing the option string from optarg. This was
visible in "ps auxf" and such and could be confusing. I didn't
understand it back in the day when I wrote that code. Now a copy
is allocated when modifiable strings are needed.
2022-12-12 15:53:01 +02:00
Lasse Collin 7623b22d1d liblzma: Check for unexpected NULL pointers in block_header_decode().
The API docs gave an impression that such checks are done
but they actually weren't done. In practice it made little
difference since the calling code has a bug if these are NULL.

Thanks to Jia Tan for the original patch that checked for
block->filters == NULL.
2022-12-12 15:47:17 +02:00
Lasse Collin ef315163ef liblzma: Use __has_attribute(__symver__) to fix Clang detection.
If someone sets up Clang to define __GNUC__ to 10 or greater
then symvers broke. __has_attribute is supported by such GCC
and Clang versions that don't support __symver__ so this should
be much better and simpler way to detect if __symver__ is
actually supported.

Thanks to Tomasz Gajc for the bug report.
2022-12-12 15:47:17 +02:00
Lasse Collin d8a898eb99 Bump version and soname for 5.2.9. 2022-11-30 18:33:05 +02:00
Lasse Collin 841448e36d liblzma: Remove two FIXME comments. 2022-11-29 10:46:06 +02:00
Lasse Collin b61da00c7f Build: Don't put GNU/Linux-specific symbol versions into static liblzma.
It not only makes no sense to put symbol versions into a static library
but it can also cause breakage.

By default Libtool #defines PIC if building a shared library and
doesn't define it for static libraries. This is documented in the
Libtool manual. It can be overriden using --with-pic or --without-pic.
configure.ac detects if --with-pic or --without-pic is used and then
gives an error if neither --disable-shared nor --disable-static was
used at the same time. Thus, in normal situations it works to build
both shared and static library at the same time on GNU/Linux,
only --with-pic or --without-pic requires that only one type of
library is built.

Thanks to John Paul Adrian Glaubitz from Debian for reporting
the problem that occurred on ia64:
https://www.mail-archive.com/xz-devel@tukaani.org/msg00610.html
2022-11-24 23:50:46 +02:00
Lasse Collin 872623def5 liblzma: Fix another invalid free() after memory allocation failure.
This time it can happen when lzma_stream_encoder_mt() is used
to reinitialize an existing multi-threaded Stream encoder
and one of 1-4 tiny allocations in lzma_filters_copy() fail.

It's very similar to the previous bug
10430fbf38, happening with
an array of lzma_filter structures whose old options are freed
but the replacement never arrives due to a memory allocation
failure in lzma_filters_copy().
2022-11-24 10:58:04 +02:00
Jia Tan b0f8d9293c liblzma: Add support for LZMA_SYNC_FLUSH in the Block encoder.
The documentation mentions that lzma_block_encoder() supports
LZMA_SYNC_FLUSH but it was never added to supported_actions[]
in the internal structure. Because of this, LZMA_SYNC_FLUSH could
not be used with the Block encoder unless it was the next coder
after something like stream_encoder() or stream_encoder_mt().
2022-11-24 10:58:04 +02:00
Lasse Collin 6997e0b5e2 liblzma: Add lzma_attr_warn_unused_result to lzma_filters_copy(). 2022-11-24 10:58:04 +02:00
Lasse Collin f94a3e3460 liblzma: Fix invalid free() after memory allocation failure.
The bug was in the single-threaded .xz Stream encoder
in the code that is used for both re-initialization and for
lzma_filters_update(). To trigger it, an application had
to either re-initialize an existing encoder instance with
lzma_stream_encoder() or use lzma_filters_update(), and
then one of the 1-4 tiny allocations in lzma_filters_copy()
(called from stream_encoder_update()) must fail. An error
was correctly reported but the encoder state was corrupted.

This is related to the recent fix in
f8ee61e74e which is good but
it wasn't enough to fix the main problem in stream_encoder.c.
2022-11-24 10:58:04 +02:00
Lasse Collin 8309385b44 liblzma: Fix language in a comment. 2022-11-24 10:57:11 +02:00
Lasse Collin 5fecba6022 liblzma: Fix infinite loop in LZMA encoder init with dict_size >= 2 GiB.
The encoder doesn't support dictionary sizes larger than 1536 MiB.
This is validated, for example, when calculating the memory usage
via lzma_raw_encoder_memusage(). It is also enforced by the LZ
part of the encoder initialization. However, LZMA encoder with
LZMA_MODE_NORMAL did an unsafe calculation with dict_size before
such validation and that results in an infinite loop if dict_size
was 2 << 30 or greater.
2022-11-24 10:57:03 +02:00
Lasse Collin 1946b2b141 liblzma: Fix two Doxygen commands in the API headers.
These were caught by clang -Wdocumentation.
2022-11-24 10:56:50 +02:00
Lasse Collin 5476089d9c Bump version and soname for 5.2.8. 2022-11-13 19:58:47 +02:00
Lasse Collin 454f567e58 liblzma: Fix building with Intel ICC (the classic compiler).
It claims __GNUC__ >= 10 but doesn't support __symver__ attribute.

Thanks to Stephen Sachs.
2022-11-11 17:16:19 +02:00
Lasse Collin 2f01169f5a liblzma: Fix incorrect #ifdef for x86 SSE2 support.
__SSE2__ is the correct macro for SSE2 support with GCC, Clang,
and ICC. __SSE2_MATH__ means doing floating point math with SSE2
instead of 387. Often the latter macro is defined if the first
one is but it was still a bug.
2022-11-11 14:36:32 +02:00
Lasse Collin fc1358679e Scripts: Ignore warnings from xz.
In practice this means making the scripts work when
the input files have an unsupported check type which
isn't a problem in practice unless support for
some check types has been disabled at build time.
2022-11-11 13:50:56 +02:00