Commit Graph

570 Commits

Author SHA1 Message Date
Dimitri Papadopoulos Orfanos 42df7c7aa1
Docs: Fix typos found by codespell 2023-07-31 20:02:21 +08:00
Jia Tan d9166b52cf liblzma: Prevent an empty translation unit in Windows builds.
To workaround Automake lacking Windows resource compiler support, an
empty source file is compiled to overwrite the resource files for static
library builds. Translation units without an external declaration are
not allowed by the C standard and result in a warning when used with
-Wempty-translation-unit (Clang) or -pedantic (GCC).
2023-07-24 23:11:13 +08:00
Jia Tan 0184d344fa liblzma: Suppress -Wunused-function warning.
Clang 16.0.0 and earlier have a bug that the ifunc resolver function
triggers the -Wunused-function warning. The resolver function is static
and only "used" by the __attribute__((__ifunc()__)).

At this time, the bug is still unresolved, but has been reported:
https://github.com/llvm/llvm-project/issues/63957

This is not a problem in GCC.
2023-07-19 23:36:00 +08:00
Jia Tan 43845fa70f liblzma: Reword lzma_str_list_filters() documentation.
This further improves the documentation from commit
f36ca7982f. The previous wording of
"supported options" was slightly misleading since the options that are
printed are the ones that are relevant for encoding/decoding. It is not
about which options can or must be specified.
2023-07-18 22:57:58 +08:00
Jia Tan 818701ba1c liblzma: Improve comment in string_conversion.c.
The comment used "flag" when referring to decoder options. Just
referring to them as options is more clear and consistent.
2023-07-18 22:56:47 +08:00
Lasse Collin 97fd5cb669 liblzma: Tweak #if condition in memcmplen.h.
Maybe ICC always #defines _MSC_VER on Windows but now
it's very clear which code will get used.
2023-07-18 13:57:54 +03:00
Lasse Collin 40392c19f7 liblzma: Omit unnecessary parenthesis in a preprocessor directive. 2023-07-18 13:49:43 +03:00
Jia Tan 17f8844e6f liblzma: Remove non-portable empty initializer.
Commit 78704f36e7 added an empty
initializer {} to prevent a warning. The empty initializer is a GNU
extension and results in a build failure on MSVC. The -wpedantic flag
warns about empty initializers.
2023-07-08 21:24:19 +08:00
Jia Tan 78704f36e7 liblzma: Prevent uninitialzed warning in mt stream encoder.
This change only impacts the compiler warning since it was impossible
for the wait_abs struct in stream_encode_mt() to be used before it was
initialized since mythread_condtime_set() will always be called before
mythread_cond_timedwait().

Since the mythread.h code is different between the POSIX and
Windows versions, this warning was only present on Windows builds.

Thanks to Arthur S for reporting the warning and providing an initial
patch.
2023-06-29 00:06:16 +08:00
Jia Tan e3356a204c liblzma: Prevent warning for MSYS2 Windows build.
In lzma_memcmplen(), the <intrin.h> header file is only included if
_MSC_VER and _M_X64 are both defined but _BitScanForward64() was
previously used if _M_X64 was defined. GCC for MSYS2 defines _M_X64 but
not _MSC_VER so _BitScanForward64() was used without including
<intrin.h>.

Now, lzma_memcmplen() will use __builtin_ctzll() for MSYS2 GCC builds as
expected.
2023-06-28 23:59:51 +08:00
Lasse Collin ee44863ae8 liblzma: Add ifunc implementation to crc64_fast.c.
The ifunc method avoids indirection via the function pointer
crc64_func. This works on GNU/Linux and probably on FreeBSD too.
The previous __attribute((__constructor__)) method is kept for
compatibility with ELF platforms which do support ifunc.

The ifunc method has some limitations, for example, building
liblzma with -fsanitize=address will result in segfaults.
The configure option --disable-ifunc must be used for such builds.

Thanks to Hans Jansen for the original patch.
Closes: https://github.com/tukaani-project/xz/pull/53
2023-06-27 23:55:59 +08:00
Jia Tan f36ca7982f liblzma: Slightly rewords lzma_str_list_filters() documentation.
Reword "options required" to "supported options". The previous may have
suggested that the options listed were all required anytime a filter is
used for encoding or decoding. The reword makes this more clear that
adjusting the options is optional.
2023-05-13 21:21:54 +08:00
Jia Tan 3374a5359e liblzma: Adds lzma_nothrow to MicroLZMA API functions.
None of the liblzma functions may throw an exception, so this
attribute should be applied to all liblzma API functions.
2023-05-12 00:00:47 +08:00
Jia Tan 8f23657498 liblzma: Exports lzma_mt_block_size() as an API function.
The lzma_mt_block_size() was previously just an internal function for
the multithreaded .xz encoder. It is used to provide a recommended Block
size for a given filter chain.

This function is helpful to determine the maximum Block size for the
multithreaded .xz encoder when one wants to change the filters between
blocks. Then, this determined Block size can be provided to
lzma_stream_encoder_mt() in the lzma_mt options parameter when
intializing the coder. This requires one to know all the filter chains
they are using before starting to encode (or at least the filter chain
that will need the largest Block size), but that isn't a bad limitation.
2023-05-11 23:54:44 +08:00
Jia Tan d0f33d672a liblzma: Creates IS_ENC_DICT_SIZE_VALID() macro.
This creates an internal liblzma macro to test if the dictionary size
is valid for encoding.
2023-05-11 22:28:45 +08:00
Jia Tan f41df2ac2f Windows: Include <intrin.h> when needed.
Legacy Windows did not need to #include <intrin.h> to use the MSVC
intrinsics. Newer versions likely just issue a warning, but the MSVC
documentation says to include the header file for the intrinsics we use.

GCC and Clang can "pretend" to be MSVC on Windows, so extra checks are
needed in tuklib_integer.h to only include <intrin.h> when it will is
actually needed.
2023-04-19 22:22:16 +08:00
Lasse Collin 3938718ce3 liblzma: Update project maintainers in lzma.h.
AUTHORS was updated earlier, lzma.h was simply forgotten.
2023-04-14 18:42:33 +03:00
Jia Tan 2a89670ab2 liblzma: Cleans up old commented out code. 2023-04-13 20:45:19 +08:00
Jia Tan 116e81f002 Build: Removes redundant check for LZMA1 filter support. 2023-03-23 21:48:52 +08:00
Lasse Collin dfe1710784 liblzma: Silence -Wsign-conversion in SSE2 code in memcmplen.h.
Thanks to Christian Hesse for reporting the issue.
Fixes: https://github.com/tukaani-project/xz/issues/44
2023-03-19 22:45:59 +02:00
Lasse Collin b473a92891 Change a few HTTP URLs to HTTPS.
The xz man page timestamp was intentionally left unchanged.
2023-03-18 15:56:07 +02:00
Jia Tan fd90e2f4c2 liblzma: Remove note from lzma_options_bcj about the ARM64 exception.
This was left in by mistake since an early version of the ARM64 filter
used a different struct for its options.
2023-03-17 01:42:28 +08:00
Jia Tan 55ba6e9300 liblzma: Add set lzma.h as the main page for Doxygen documentation.
The \mainpage command is used in the first block of comments in lzma.h.
This changes the previously nearly empty index.html to use the first
comment block in lzma.h for its contents.

lzma.h is no longer documented separately, but this is for the better
since lzma.h only defined a few macros that users do not need to use.
The individual API header files all have a disclaimer that they should
not be #included directly, so there should be no confusion on the fact
that lzma.h should be the only header used by applications.

Additionally, the note "See ../lzma.h for information about liblzma as
a whole." was removed since lzma.h is now the main page of the
generated HTML and does not have its own page anymore. So it would be
confusing in the HTML version and was only a "nice to have" when
browsing the source files.
2023-03-17 01:42:28 +08:00
Jia Tan af55191102 liblzma: Defines masks for return values from lzma_index_checks(). 2023-03-13 20:49:53 +08:00
Jia Tan f1ab1f6b33 liblzma: Clarify lzma_lzma_preset() documentation in lzma12.h.
lzma_lzma_preset() does not guarentee that the lzma_options_lzma are
usable in an encoder even if it returns false (success). If liblzma
is built with default configurations, then the options will always be
usable. However if the match finders hc3, hc4, or bt4 are disabled, then
the options may not be usable depending on the preset level requested.

The documentation was updated to reflect this complexity, since this
behavior was unclear before.
2023-03-01 21:42:31 +08:00
Jia Tan 3cf72c4bcb liblzma: Replace '\n' -> newline in filter.h documentation.
The '\n' renders as a newline when the comments are converted to html
by Doxygen.
2023-02-24 21:09:39 +08:00
Jia Tan 002006be62 liblzma: Shorten return description for two functions in filter.h.
Shorten the description for lzma_raw_encoder_memusage() and
lzma_raw_decoder_memusage().
2023-02-24 21:09:39 +08:00
Jia Tan 463d9359b8 liblzma: Reword a few lines in filter.h 2023-02-24 21:09:39 +08:00
Jia Tan 01441df92c liblzma: Improve documentation in filter.h.
All functions now explicitly specify parameter and return values.
The notes and code annotations were moved before the parameter and
return value descriptions for consistency.

Also, the description above lzma_filter_encoder_is_supported() about
not being able to list available filters was removed since
lzma_str_list_filters() will do this.
2023-02-24 21:09:39 +08:00
Lasse Collin 30e95bb44c liblzma: Avoid null pointer + 0 (undefined behavior in C).
In the C99 and C17 standards, section 6.5.6 paragraph 8 means that
adding 0 to a null pointer is undefined behavior. As of writing,
"clang -fsanitize=undefined" (Clang 15) diagnoses this. However,
I'm not aware of any compiler that would take advantage of this
when optimizing (Clang 15 included). It's good to avoid this anyway
since compilers might some day infer that pointer arithmetic implies
that the pointer is not NULL. That is, the following foo() would then
unconditionally return 0, even for foo(NULL, 0):

    void bar(char *a, char *b);

    int foo(char *a, size_t n)
    {
        bar(a, a + n);
        return a == NULL;
    }

In contrast to C, C++ explicitly allows null pointer + 0. So if
the above is compiled as C++ then there is no undefined behavior
in the foo(NULL, 0) call.

To me it seems that changing the C standard would be the sane
thing to do (just add one sentence) as it would ensure that a huge
amount of old code won't break in the future. Based on web searches
it seems that a large number of codebases (where null pointer + 0
occurs) are being fixed instead to be future-proof in case compilers
will some day optimize based on it (like making the above foo(NULL, 0)
return 0) which in the worst case will cause security bugs.

Some projects don't plan to change it. For example, gnulib and thus
many GNU tools currently require that null pointer + 0 is defined:

    https://lists.gnu.org/archive/html/bug-gnulib/2021-11/msg00000.html

    https://www.gnu.org/software/gnulib/manual/html_node/Other-portability-assumptions.html

In XZ Utils null pointer + 0 issue should be fixed after this
commit. This adds a few if-statements and thus branches to avoid
null pointer + 0. These check for size > 0 instead of ptr != NULL
because this way bugs where size > 0 && ptr == NULL will likely
get caught quickly. None of them are in hot spots so it shouldn't
matter for performance.

A little less readable version would be replacing

    ptr + offset

with

    offset != 0 ? ptr + offset : ptr

or creating a macro for it:

    #define my_ptr_add(ptr, offset) \
            ((offset) != 0 ? ((ptr) + (offset)) : (ptr))

Checking for offset != 0 instead of ptr != NULL allows GCC >= 8.1,
Clang >= 7, and Clang-based ICX to optimize it to the very same code
as ptr + offset. That is, it won't create a branch. So for hot code
this could be a good solution to avoid null pointer + 0. Unfortunately
other compilers like ICC 2021 or MSVC 19.33 (VS2022) will create a
branch from my_ptr_add().

Thanks to Marcin Kowalczyk for reporting the problem:
https://github.com/tukaani-project/xz/issues/36
2023-02-23 20:41:22 +02:00
Jia Tan fa9065fac5 liblzma: Adjust container.h for consistency with filter.h. 2023-02-23 20:27:59 +08:00
Jia Tan 00a721b63d liblzma: Fix small typos and reword a few things in filter.h. 2023-02-23 20:27:59 +08:00
Jia Tan 5b1c171d4f liblzma: Convert list of flags in lzma_mt to bulleted list. 2023-02-23 20:27:59 +08:00
Jia Tan dbd47622eb liblzma: Fix typo in documentation in container.h
lzma_microlzma_decoder -> lzma_microlzma_encoder
2023-02-23 20:27:59 +08:00
Jia Tan 14cd30806d liblzma: Improve documentation for container.h
Standardizing each function to always specify parameters and return
values. Also moved the parameters and return values to the end of each
function description.
2023-02-23 20:27:59 +08:00
Lasse Collin d831072cce liblzma: Very minor API doc tweaks.
Use "member" to refer to struct members as that's the term used
by the C standard.

Use lzma_options_delta.dist and such in docs so that in Doxygen's
HTML output they will link to the doc of the struct member.

Clean up a few trailing white spaces too.
2023-02-16 21:09:00 +02:00
Jia Tan f029daea39 liblzma: Adjust spacing in doc headers in bcj.h. 2023-02-17 00:54:33 +08:00
Jia Tan a5de68bac2 liblzma: Adjust documentation in bcj.h for consistent style. 2023-02-17 00:49:47 +08:00
Jia Tan efa498c13b liblzma: Rename field => member in documentation.
Also adjusted preset value => preset level.
2023-02-17 00:49:47 +08:00
Lasse Collin 718b22a6c5 liblzma: Silence a warning from MSVC.
It gives C4146 here since unary minus with unsigned integer
is still unsigned (which is the intention here). Doing it
with substraction makes it clearer and avoids the warning.

Thanks to Nathan Moinvaziri for reporting this.
2023-02-16 17:59:50 +02:00
Jia Tan 87c53553fa liblzma: Improve documentation for stream_flags.h
Standardizing each function to always specify parameters and return
values. Also moved the parameters and return values to the end of each
function description.

A few small things were reworded and long sentences broken up.
2023-02-16 21:04:54 +08:00
Jia Tan 13d99e75a5 liblzma: Improve documentation in lzma12.h.
All functions now explicitly specify parameter and return values.
2023-02-15 22:21:44 +08:00
Jia Tan 43ec344c86 liblzma: Improve documentation in check.h.
All functions now explicitly specify parameter and return values.
Also moved the note about SHA-256 functions not being exported to the
top of the file.
2023-02-15 00:59:16 +08:00
Jia Tan 9c71db4e88 liblzma: Improve documentation in index.h
All functions now explicitly specify parameter and return values.
2023-02-15 00:20:44 +08:00
Jia Tan 421f2f2e16 liblzma: Reword a comment in index.h. 2023-02-15 00:20:44 +08:00
Jia Tan b675394849 liblzma: Omit lzma_index_iter's internal field from Doxygen docs.
Add \private above this field and its sub-fields since it is not meant
to be modified by users.
2023-02-15 00:20:44 +08:00
Jia Tan 0c9e4fc2ad liblzma: Fix documentation for LZMA_MEMLIMIT_ERROR.
LZMA_MEMLIMIT_ERROR was missing the "<" character needed to put
documentation after a member.
2023-02-14 20:41:05 +08:00
Jia Tan 816fec125a liblzma: Improve documentation for base.h.
Standardizing each function to always specify params and return values.
Also fixed a small grammar mistake.
2023-02-14 20:41:05 +08:00
Jia Tan 862dacef1a liblzma: Add one more missing [out] annotation in vli.h 2023-02-14 00:12:34 +08:00
Jia Tan 867b08ae42 liblzma: Minor improvements to vli.h.
Added [out] annotations to parameters that are pointers and can have
their value changed. Also added a clarification to lzma_vli_is_valid.
2023-02-14 00:08:33 +08:00