Commit Graph

1757 Commits

Author SHA1 Message Date
Lasse Collin 30e95bb44c liblzma: Avoid null pointer + 0 (undefined behavior in C).
In the C99 and C17 standards, section 6.5.6 paragraph 8 means that
adding 0 to a null pointer is undefined behavior. As of writing,
"clang -fsanitize=undefined" (Clang 15) diagnoses this. However,
I'm not aware of any compiler that would take advantage of this
when optimizing (Clang 15 included). It's good to avoid this anyway
since compilers might some day infer that pointer arithmetic implies
that the pointer is not NULL. That is, the following foo() would then
unconditionally return 0, even for foo(NULL, 0):

    void bar(char *a, char *b);

    int foo(char *a, size_t n)
    {
        bar(a, a + n);
        return a == NULL;
    }

In contrast to C, C++ explicitly allows null pointer + 0. So if
the above is compiled as C++ then there is no undefined behavior
in the foo(NULL, 0) call.

To me it seems that changing the C standard would be the sane
thing to do (just add one sentence) as it would ensure that a huge
amount of old code won't break in the future. Based on web searches
it seems that a large number of codebases (where null pointer + 0
occurs) are being fixed instead to be future-proof in case compilers
will some day optimize based on it (like making the above foo(NULL, 0)
return 0) which in the worst case will cause security bugs.

Some projects don't plan to change it. For example, gnulib and thus
many GNU tools currently require that null pointer + 0 is defined:

    https://lists.gnu.org/archive/html/bug-gnulib/2021-11/msg00000.html

    https://www.gnu.org/software/gnulib/manual/html_node/Other-portability-assumptions.html

In XZ Utils null pointer + 0 issue should be fixed after this
commit. This adds a few if-statements and thus branches to avoid
null pointer + 0. These check for size > 0 instead of ptr != NULL
because this way bugs where size > 0 && ptr == NULL will likely
get caught quickly. None of them are in hot spots so it shouldn't
matter for performance.

A little less readable version would be replacing

    ptr + offset

with

    offset != 0 ? ptr + offset : ptr

or creating a macro for it:

    #define my_ptr_add(ptr, offset) \
            ((offset) != 0 ? ((ptr) + (offset)) : (ptr))

Checking for offset != 0 instead of ptr != NULL allows GCC >= 8.1,
Clang >= 7, and Clang-based ICX to optimize it to the very same code
as ptr + offset. That is, it won't create a branch. So for hot code
this could be a good solution to avoid null pointer + 0. Unfortunately
other compilers like ICC 2021 or MSVC 19.33 (VS2022) will create a
branch from my_ptr_add().

Thanks to Marcin Kowalczyk for reporting the problem:
https://github.com/tukaani-project/xz/issues/36
2023-02-23 20:41:22 +02:00
Jia Tan fa9065fac5 liblzma: Adjust container.h for consistency with filter.h. 2023-02-23 20:27:59 +08:00
Jia Tan 00a721b63d liblzma: Fix small typos and reword a few things in filter.h. 2023-02-23 20:27:59 +08:00
Jia Tan 5b1c171d4f liblzma: Convert list of flags in lzma_mt to bulleted list. 2023-02-23 20:27:59 +08:00
Jia Tan dbd47622eb liblzma: Fix typo in documentation in container.h
lzma_microlzma_decoder -> lzma_microlzma_encoder
2023-02-23 20:27:59 +08:00
Jia Tan 14cd30806d liblzma: Improve documentation for container.h
Standardizing each function to always specify parameters and return
values. Also moved the parameters and return values to the end of each
function description.
2023-02-23 20:27:59 +08:00
Jia Tan c9c8bfae35 CMake: Add LZIP decoder test to list of tests. 2023-02-22 21:10:28 +08:00
Lasse Collin b9f171dd00 Update THANKS. 2023-02-17 20:56:49 +02:00
Lasse Collin 2ee86d20e4 Build: Use only the generic symbol versioning on MicroBlaze.
On MicroBlaze, GCC 12 is broken in sense that
__has_attribute(__symver__) returns true but it still doesn't
support the __symver__ attribute even though the platform is ELF
and symbol versioning is supported if using the traditional
__asm__(".symver ...") method. Avoiding the traditional method is
good because it breaks LTO (-flto) builds with GCC.

See also: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101766

For now the only extra symbols in liblzma_linux.map are the
compatibility symbols with the patch that spread from RHEL/CentOS 7.
These require the use of __symver__ attribute or __asm__(".symver ...")
in the C code. Compatibility with the patch from CentOS 7 doesn't
seem valuable on MicroBlaze so use liblzma_generic.map on MicroBlaze
instead. It doesn't require anything special in the C code and thus
no LTO issues either.

An alternative would be to detect support for __symver__
attribute in configure.ac and CMakeLists.txt and fall back
to __asm__(".symver ...") but then LTO would be silently broken
on MicroBlaze. It sounds likely that MicroBlaze is a special
case so let's treat it as a such because that is simpler. If
a similar issue exists on some other platform too then hopefully
someone will report it and this can be reconsidered.

(This doesn't do the same fix in CMakeLists.txt. Perhaps it should
but perhaps CMake build of liblzma doesn't matter much on MicroBlaze.
The problem breaks the build so it's easy to notice and can be fixed
later.)

Thanks to Vincent Fazio for reporting the problem and proposing
a patch (in the end that solution wasn't used):
https://github.com/tukaani-project/xz/pull/32
2023-02-17 20:48:28 +02:00
Lasse Collin d831072cce liblzma: Very minor API doc tweaks.
Use "member" to refer to struct members as that's the term used
by the C standard.

Use lzma_options_delta.dist and such in docs so that in Doxygen's
HTML output they will link to the doc of the struct member.

Clean up a few trailing white spaces too.
2023-02-16 21:09:00 +02:00
Jia Tan f029daea39 liblzma: Adjust spacing in doc headers in bcj.h. 2023-02-17 00:54:33 +08:00
Jia Tan a5de68bac2 liblzma: Adjust documentation in bcj.h for consistent style. 2023-02-17 00:49:47 +08:00
Jia Tan efa498c13b liblzma: Rename field => member in documentation.
Also adjusted preset value => preset level.
2023-02-17 00:49:47 +08:00
Lasse Collin 718b22a6c5 liblzma: Silence a warning from MSVC.
It gives C4146 here since unary minus with unsigned integer
is still unsigned (which is the intention here). Doing it
with substraction makes it clearer and avoids the warning.

Thanks to Nathan Moinvaziri for reporting this.
2023-02-16 17:59:50 +02:00
Jia Tan 87c53553fa liblzma: Improve documentation for stream_flags.h
Standardizing each function to always specify parameters and return
values. Also moved the parameters and return values to the end of each
function description.

A few small things were reworded and long sentences broken up.
2023-02-16 21:04:54 +08:00
Jia Tan 13d99e75a5 liblzma: Improve documentation in lzma12.h.
All functions now explicitly specify parameter and return values.
2023-02-15 22:21:44 +08:00
Jia Tan 43ec344c86 liblzma: Improve documentation in check.h.
All functions now explicitly specify parameter and return values.
Also moved the note about SHA-256 functions not being exported to the
top of the file.
2023-02-15 00:59:16 +08:00
Jia Tan 9c71db4e88 liblzma: Improve documentation in index.h
All functions now explicitly specify parameter and return values.
2023-02-15 00:20:44 +08:00
Jia Tan 421f2f2e16 liblzma: Reword a comment in index.h. 2023-02-15 00:20:44 +08:00
Jia Tan b675394849 liblzma: Omit lzma_index_iter's internal field from Doxygen docs.
Add \private above this field and its sub-fields since it is not meant
to be modified by users.
2023-02-15 00:20:44 +08:00
Jia Tan 0c9e4fc2ad liblzma: Fix documentation for LZMA_MEMLIMIT_ERROR.
LZMA_MEMLIMIT_ERROR was missing the "<" character needed to put
documentation after a member.
2023-02-14 20:41:05 +08:00
Jia Tan 816fec125a liblzma: Improve documentation for base.h.
Standardizing each function to always specify params and return values.
Also fixed a small grammar mistake.
2023-02-14 20:41:05 +08:00
Jia Tan 862dacef1a liblzma: Add one more missing [out] annotation in vli.h 2023-02-14 00:12:34 +08:00
Jia Tan 867b08ae42 liblzma: Minor improvements to vli.h.
Added [out] annotations to parameters that are pointers and can have
their value changed. Also added a clarification to lzma_vli_is_valid.
2023-02-14 00:08:33 +08:00
Jia Tan 90d0e628ff liblzma: Add comments for macros in delta.h.
Document LZMA_DELTA_DIST_MIN and LZMA_DELTA_DIST_MAX for completeness
and to avoid Doxygen warnings.
2023-02-10 21:38:25 +08:00
Jia Tan 9255fffdb1 liblzma: Improve documentation in index_hash.h.
All functions now explicitly specify parameter and return values.
Also reworded the description of lzma_index_hash_init() for readability.
2023-02-10 21:35:23 +08:00
Lasse Collin 1dbe12b90c xz: Improve the comment about start_time in mytime.c.
start_time is relative to an arbitary point in time, it's not
time of day, so using it for anything else than time differences
wouldn't make sense.
2023-02-07 19:07:45 +02:00
Jia Tan 7673ef5aa8 Build: Adjust CMake version search regex.
Now, the LZMA_VERSION_MAJOR, LZMA_VERSION_MINOR, and LZMA_VERSION_PATCH
macros do not need to be on consecutive lines in version.h. They can be
separated by more whitespace, comments, or even other content, as long
as they appear in the proper order (major, minor, patch).
2023-02-04 21:06:35 +08:00
Jia Tan b8bce89be7 xz: Add a comment clarifying the use of start_time in mytime.c. 2023-02-04 20:11:51 +08:00
Jia Tan 912af91b10 liblzma: Improve documentation for version.h.
Specified parameter and return values for API functions and documented
a few more of the macros.
2023-02-04 20:11:36 +08:00
Jia Tan 850adec171 Docs: Omit SIGTSTP not handled from TODO. 2023-02-03 22:52:55 +08:00
Jia Tan 2c78a83c6f liblzma: Fix bug in lzma_str_from_filters() not checking filters[] length.
The bug is only a problem in applications that do not properly terminate
the filters[] array with LZMA_VLI_UNKNOWN or have more than
LZMA_FILTERS_MAX filters. This bug does not affect xz.
2023-02-03 00:42:27 +08:00
Jia Tan e01f01b9af Tests: Create test_filter_str.c.
Tests lzma_str_to_filters(), lzma_str_from_filters(), and
lzma_str_list_filters() API functions.
2023-02-03 00:42:27 +08:00
Jia Tan 8dfc029e7a liblzma: Fix typos in comments in string_conversion.c. 2023-02-03 00:42:27 +08:00
Jia Tan 54ad83c1ae liblzma: Clarify block encoder and decoder documentation.
Added a few sentences to the description for lzma_block_encoder() and
lzma_block_decoder() to highlight that the Block Header must be coded
before calling these functions.
2023-02-03 00:22:53 +08:00
Jia Tan f680e771b3 Update lzma_block documentation for lzma_block_uncomp_encode(). 2023-02-03 00:22:53 +08:00
Jia Tan 504cf4af89 liblzma: Minor edits to lzma_block header_size documentation. 2023-02-03 00:22:53 +08:00
Jia Tan 115b720fb5 liblzma: Enumerate functions that read version in lzma_block. 2023-02-03 00:22:53 +08:00
Jia Tan 85ea0979ad liblzma: Clarify comment in block.h. 2023-02-03 00:22:53 +08:00
Jia Tan 1f7ab90d9c liblzma: Improve documentation for block.h.
Standardizing each function to always specify params and return values.
Output pointer parameters are also marked with doxygen style [out] to
make it clear. Any note sections were also moved above the parameter and
return sections for consistency.
2023-02-03 00:22:53 +08:00
Jia Tan c563a4bc55 liblzma: Clarify a comment about LZMA_STR_NO_VALIDATION.
The flag description for LZMA_STR_NO_VALIDATION was previously confusing
about the treatment for filters than cannot be used with .xz format
(lzma1) without using LZMA_STR_ALL_FILTERS. Now, it is clear that
LZMA_STR_NO_VALIDATION is not a super set of LZMA_STR_ALL_FILTERS.
2023-02-01 23:39:45 +08:00
Jia Tan 315c64c7e1 CI: Update .gitignore for artifacts directory in build-aux.
The workflow action for our CI pipeline can only reference artifacts in
the source directory, so we should ignore these files if the ci_build.sh
is run locally.
2023-02-01 21:47:35 +08:00
Jia Tan 2c1341f4fa CI: Add quotes around variables in a few places. 2023-02-01 21:47:35 +08:00
Jia Tan 3a401b0e0c CI: Upload test logs as artifacts if a test fails. 2023-02-01 21:47:35 +08:00
Lasse Collin 610dde15a8 xz: Use clock_gettime() even if CLOCK_MONOTONIC isn't available.
mythread.h and thus liblzma already does it.
2023-01-27 20:02:49 +02:00
Lasse Collin 2e02877288 po4a/po4a.conf: Sort the language identifiers in alphabetical order. 2023-01-27 19:41:19 +02:00
Lasse Collin ff592c616e xz: Add SIGTSTP handler for progress indicator time keeping.
This way, if xz is stopped the elapsed time and estimated time
remaining won't get confused by the amount of time spent in
the stopped state.

This raises SIGSTOP. It's not clear to me if this is the correct way.
POSIX and glibc docs say that SIGTSTP shouldn't stop the process if
it is orphaned but this commit doesn't attempt to handle that.

Search for SIGTSTP in section 2.4.3:

https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html
2023-01-27 19:37:47 +02:00
Jia Tan 3b1c8ac8d1 Translations: Add Brazilian Portuguese translation of man pages.
Thanks to Rafael Fontenelle.
2023-01-27 20:16:54 +08:00
Lasse Collin a15a7552f9 Build: Avoid different quoting style in --enable-doxygen doc. 2023-01-26 17:51:06 +02:00
Lasse Collin af5a4bd5af tuklib_physmem: Check for __has_warning before GCC version.
Clang can be configured to fake a too high GCC version so
this way it's more robust.
2023-01-26 17:39:46 +02:00