Clang 16.0.0 and earlier have a bug that the ifunc resolver function
triggers the -Wunused-function warning. The resolver function is static
and only "used" by the __attribute__((__ifunc()__)).
At this time, the bug is still unresolved, but has been reported:
https://github.com/llvm/llvm-project/issues/63957
This is not a problem in GCC.
This further improves the documentation from commit
f36ca7982f. The previous wording of
"supported options" was slightly misleading since the options that are
printed are the ones that are relevant for encoding/decoding. It is not
about which options can or must be specified.
The Memory limit information section described three output
columns when it actually has six. This was reworded to
"multiple" to make it more future proof.
* Moved max_block_list_size from a global to local variable.
* Reworded error message in validate_block_list_filter().
* Removed helper function filter_chain_error().
* Changed 1 << X to 1U << X in many places
The order is now consistent with the order the command line arguments
are documented earlier in the man page. The new order is:
1. --list
2. --info-memory
3. --version
Instead of the previous order:
1. --version
2. --info-memory
3. --list
The --filters-help can be used to help create filter chains with the
--filters and --filtersX options. The message in --long-help is too
short to fully explain the syntax to construct complex filter chains.
In --robot mode, xz will only print the output from liblzma function
lzma_str_list_filters.
The --block-list option description needed updating since the new
--filtersX option changes how it can be used. The new entry for
--filters1=FILTERS ... --filter9=FILTERS was created right after
the --filters option.
If a filter chain is set but not used in --block-list, it introduced
unexpected behavior such as requiring an unneeded amount of memory to
compress, reducing the number of threads in multi-threaded encoding, and
printing an incorrect amount of memory needed to decompress.
This also renames filters_init_mask => filters_used_mask. A filter is
assumed to be used if it is specified in --filtersX until
coder_set_compression_settings() determines which filters are referenced
in --block-list.
When opt_block_size is not used, the Block size for mt encoder is
derived from the minimum of the largest Block specified by
--block-list and the recommended Block size on all filter chains
calculated by lzma_mt_block_size(). This avoids using unnecessary
memory and ensures that all Blocks are large enough for the most memory
needy filter chain.
Previously, only the default filter chain could have its memory usage
adjusted. The filter chains specified with --filtersX were not checked
for memory usage. Now, all used filter chains will be adjusted if
necessary.
The block splitting logic and split_block() function are not needed if
encoders are disabled. This will help slightly reduce the binary size
when built without encoders and allow split_block() to use functions
that require encoders being enabled.
This will only free filter chains created with --filters1-9 since the
default filter chain may be set from a static function variable. The
complexity to free the default filter chain is not worth the burden on
code maintenance.
The new command line options are meant to be combined with --block-list.
They work as an optional extension to --block-list to specify a custom
filter chain for each block listed. The new options allow the creation
of up to 9 reusable filter chains. For instance:
xz --block-list=1:10MiB,3:5MiB,,2:5MiB,1:0 --filters1=delta--lzma2 \
--filters2=x86--lzma2 --filters3=arm64--lzma2
Will create the following blocks:
1. A block of size 10 MiB with filter chain delta, lzma2.
2. A block of size 5 MiB with filter chain arm64, lzma2.
3. A block of size 5 MiB with filter chain arm64, lzma2.
4. A block of size 5 MiB with filter chain x86, lzma2.
5. A block containing the rest of the file contents with filter chain
delta, lzma2.
This is a little cleaner than the previous implementation of
forget_filter_chain(). It is also more consistent since
lzma_str_to_filters() will always terminate the filter chain so there
is no need to terminate it later in coder_set_compression_settings().
The --filters option uses the new lzma_str_to_filters() function
to convert a string into a full filter chain. Using this option
will reset all previous filters set by --preset, --[filter], or
--filters.
Commit 78704f36e7 added an empty
initializer {} to prevent a warning. The empty initializer is a GNU
extension and results in a build failure on MSVC. The -wpedantic flag
warns about empty initializers.
This change only impacts the compiler warning since it was impossible
for the wait_abs struct in stream_encode_mt() to be used before it was
initialized since mythread_condtime_set() will always be called before
mythread_cond_timedwait().
Since the mythread.h code is different between the POSIX and
Windows versions, this warning was only present on Windows builds.
Thanks to Arthur S for reporting the warning and providing an initial
patch.
In lzma_memcmplen(), the <intrin.h> header file is only included if
_MSC_VER and _M_X64 are both defined but _BitScanForward64() was
previously used if _M_X64 was defined. GCC for MSYS2 defines _M_X64 but
not _MSC_VER so _BitScanForward64() was used without including
<intrin.h>.
Now, lzma_memcmplen() will use __builtin_ctzll() for MSYS2 GCC builds as
expected.
The ifunc method avoids indirection via the function pointer
crc64_func. This works on GNU/Linux and probably on FreeBSD too.
The previous __attribute((__constructor__)) method is kept for
compatibility with ELF platforms which do support ifunc.
The ifunc method has some limitations, for example, building
liblzma with -fsanitize=address will result in segfaults.
The configure option --disable-ifunc must be used for such builds.
Thanks to Hans Jansen for the original patch.
Closes: https://github.com/tukaani-project/xz/pull/53
Reword "options required" to "supported options". The previous may have
suggested that the options listed were all required anytime a filter is
used for encoding or decoding. The reword makes this more clear that
adjusting the options is optional.
The lzma_mt_block_size() was previously just an internal function for
the multithreaded .xz encoder. It is used to provide a recommended Block
size for a given filter chain.
This function is helpful to determine the maximum Block size for the
multithreaded .xz encoder when one wants to change the filters between
blocks. Then, this determined Block size can be provided to
lzma_stream_encoder_mt() in the lzma_mt options parameter when
intializing the coder. This requires one to know all the filter chains
they are using before starting to encode (or at least the filter chain
that will need the largest Block size), but that isn't a bad limitation.
Legacy Windows did not need to #include <intrin.h> to use the MSVC
intrinsics. Newer versions likely just issue a warning, but the MSVC
documentation says to include the header file for the intrinsics we use.
GCC and Clang can "pretend" to be MSVC on Windows, so extra checks are
needed in tuklib_integer.h to only include <intrin.h> when it will is
actually needed.
Clang has support for __builtin_clz(), but previously Clang would
fallback to either the MSVC intrinsic or the regular C code. This was
discovered due to a bug where a new version of Clang required the
<intrin.h> header file in order to use the MSVC intrinsics.
Thanks to Anton Kochkov for notifying us about the bug.
The \mainpage command is used in the first block of comments in lzma.h.
This changes the previously nearly empty index.html to use the first
comment block in lzma.h for its contents.
lzma.h is no longer documented separately, but this is for the better
since lzma.h only defined a few macros that users do not need to use.
The individual API header files all have a disclaimer that they should
not be #included directly, so there should be no confusion on the fact
that lzma.h should be the only header used by applications.
Additionally, the note "See ../lzma.h for information about liblzma as
a whole." was removed since lzma.h is now the main page of the
generated HTML and does not have its own page anymore. So it would be
confusing in the HTML version and was only a "nice to have" when
browsing the source files.