Even though the proper name for the architecture is aarch64, this
project uses ARM64 throughout. So the rename is for consistency.
Additionally, crc32_arm64.h was slightly refactored for the following
changes:
* Added MSVC, FreeBSD, and macOS support in
is_arch_extension_supported().
* crc32_arch_optimized() now checks the size when aligning the
buffer.
* crc32_arch_optimized() loop conditions were slightly modified to
avoid both decrementing the size and incrementing the buffer
pointer.
* Use the intrinsic wrappers defined in <arm_acle.h> because GCC and
Clang name them differently.
* Minor spacing and comment changes.
The CRC32 instructions in ARM64 can calculate the CRC32 result
for 8 bytes in a single operation, making the use of ARM64
instructions much faster compared to the general CRC32 algorithm.
Optimized CRC32 will be enabled if ARM64 has CRC extension
running on Linux.
Signed-off-by: Chenxi Mao <chenxi.mao2013@gmail.com>
A CLMUL-only build will have the crcxx_clmul() inlined into
lzma_crcxx(). Previously a jump to the extern lzma_crcxx_clmul()
was needed. Notes about shared liblzma on ELF platforms:
- On platforms that support ifunc and -fvisibility=hidden, this
was silly because CLMUL-only build would have that single extra
jump instruction of extra overhead.
- On platforms that support neither -fvisibility=hidden nor linker
version script (liblzma*.map), jumping to lzma_crcxx_clmul()
would go via PLT so a few more instructions of overhead (still
not a big issue but silly nevertheless).
There was a downside with static liblzma too: if an application only
needs lzma_crc64(), static linking would make the linker include the
CLMUL code for both CRC32 and CRC64 from crc_x86_clmul.o even though
the CRC32 code wouldn't be needed, thus increasing code size of the
executable (assuming that -ffunction-sections isn't used).
Also, now compilers are likely to inline crc_simd_body()
even if they don't support the always_inline attribute
(or MSVC's __forceinline). Quite possibly all compilers
that build the code do support such an attribute. But now
it likely isn't a problem even if the attribute wasn't supported.
Now all x86-specific stuff is in crc_x86_clmul.h. If other archs
The other archs can then have their own headers with their own
is_clmul_supported() and crcxx_clmul().
Another bonus is that the build system doesn't need to care if
crc_clmul.c is needed.
is_clmul_supported() stays as inline function as it's not needed
when doing a CLMUL-only build (avoids a warning about unused function).
The sandbox is now enabled for xzdec as well, so it no longer belongs
in just the xz section. xz and xzdec are always built, except for older
MSVC versions, so there isn't a need to conditionally show the sandbox
configuration. CMake will do a little unecessary work on older MSVC
versions that can't build xz or xzdec, but this is a very small
downside.
A CMake option LARGE_FILE_SUPPORT is created if and only if
-D_FILE_OFFSET_BITS=64 affects sizeof(off_t).
This is needed on many 32-bit platforms and even with 64-bit builds
with MinGW-w64 to get support for files larger than 2 GiB.
Autotools based build uses -pthread and thus adds it to Libs.private
in liblzma.pc. CMake doesn't use -pthread at all if pthread functions
are available in libc so Libs.private doesn't get -pthread either.
Now configure will fail if -fsanitize= is found in CFLAGS
and sanitizer-incompatible ifunc or Landlock sandboxing
would be used. These are incompatible with one or more sanitizers.
It's simpler to reject all -fsanitize= uses instead of trying to
pass those that might not cause problems.
CMake-based build was updated similarly. It lets the configuration
finish (SEND_ERROR instead of FATAL_ERROR) so that both error
messages can be seen at once.
Using set(ENABLE_THREADS "posix") is confusing because it sets
a new normal variable and leaves the cache entry with the same
name unchanged. The intent wasn't to change the cache entry so
this switches to a different variable name.
This way typos are caught quickly and compounding error messages
are avoided (a single typo could cause more than one error).
This keeps using SEND_ERROR when the system is lacking a feature
(like threading library or sandboxing method). This way the whole
configuration log will be generated in case someone wishes to
report a problem upstream.
The option is enabled by default, but will only be visible to a user
listing cache variables or using a CMake GUI application if the
immintrin.h header file is found.
This mirrors our Autotools build --disable-clmul-crc functionality.
Both crc32_clmul() and crc64_clmul() are now exported from
crc32_clmul.c as lzma_crc32_clmul() and lzma_crc64_clmul(). This
ensures that is_clmul_supported() (now lzma_is_clmul_supported()) is
not duplicated between crc32_fast.c and crc64_fast.c.
Also, it encapsulates the complexity of the CLMUL implementations into a
single file and reduces the complexity of crc32_fast.c and crc64_fast.c.
Before, CLMUL code was present in crc32_fast.c, crc64_fast.c, and
crc_common.h.
During the conversion, various cleanups were applied to code (thanks to
Lasse Collin) including:
- Require using semicolons with MASK_/L/H/LH macros.
- Variable typing and const handling improvements.
- Improvements to comments.
- Fixes to the pragmas used.
- Removed unneeded variables.
- Whitespace improvements.
- Fixed CRC_USE_GENERIC_FOR_SMALL_INPUTS handling.
- Silenced warnings and removed the need for some #pragmas
CMake doesn't set WIN32 on CYGWIN but the workaround is
probably needed on Cygwin too. Same for MSYS and MSYS2.
The workaround must not be used with Clang that is acting in
MSVC mode. This fixes it by checking for the known environments
that need the workaround instead of using "NOT MSVC".
Thanks to Martin Storsjö.
0570308ddd (commitcomment-129098431)
The Ninja Generator for CMake cannot have a custom target and its
BYPRODUCTS have the same name. This has prevented Ninja builds on
Unix-like systems since the xz symlinks were introduced in
80a1a8bb83.
llvm-windres 17.0.0 has more accurate emulation of GNU windres, so
the hack for GNU windres must now be used with llvm-windres too.
LLVM 16.0.6 has the old behavior and there likely won't be more
16.x releases. So we can simply check for >= 17.0.0.
See also:
2bcc0fdc58
Now if user-supplied CFLAGS contains -Wall -Wextra -Wpedantic
the two checks that need -Werror will still work.
At CMake side there is add_compile_options(-Wall -Wextra)
but it didn't affect the -Werror tests. So with both Autotools
and CMake only user-supplied CFLAGS could make the checks fail
when they shouldn't.
This is not a full fix as things like -Wunused-macros in
user-supplied CFLAGS will still cause problems with both
GCC and Clang.
xzdec might build with VS2013 but it hasn't been tested.
It was never supported before and VS2013 is old anyway
so for simplicity only liblzma is supported with VS2013.
Building the command line tools xz and xzdec with the combination
of CMake + Visual Studio 2015/2017/2019/2022 works now.
VS2013 update 2 should still be able to build liblzma.
VS2013 cannot build the xz command line tool because xz
needs snprintf() that roughly conforms to C99.
VS2013 is old and no extra code will be added to support it.
Thanks to Kelvin Lee and Jia Tan for testing.
There are several new policies. CMP0149 may affect the Windows SDK
version that CMake will choose by default. The new behavior is more
predictable, always choosing the latest SDK version by default.
The other new policies shouldn't affect this package.
If CMake was configured more than once, HAVE_CLOCK_GETTIME and
HAVE_CLOCK_MONOTONIC would not be set as compile definitions. The check
for librt being needed to provide HAVE_CLOCK_GETTIME was also
simplified.
The CMake build will try to create broken symlinks on Unix and Unix-like
platforms. Cygwin and MSYS2 are Unix-like, but may not be able to create
broken symlinks. The value of the CYGWIN or MSYS environment variables
determine if broken symlinks are valid.
The default for MSYS2 does not allow for broken symlinks, so the CMake
build has been broken for MSYS2 since commit
80a1a8bb83.
CMake build system will now verify if __attribute__((__ifunc__())) can be
used in the build system. If so, HAVE_FUNC_ATTRIBUTE_IFUNC will be
defined to 1.
Boost iostream uses `find_package` in quiet mode and then again uses
`find_package` with required. This second call triggers a
`add_library cannot create imported target "ZLIB::ZLIB" because another
target with the same name already exists.`
This can simply be fixed by skipping the alias part on secondary
`find_package` runs.
The thread method is now configurable for the CMake build. It matches
the Autotools build by allowing ON (pick the best threading method),
OFF (no threading), posix, win95, and vista. If both Windows and
posix threading are both available, then ON will choose Windows
threading. Windows threading will also not use:
target_link_libraries(liblzma Threads::Threads)
since on systems like MinGW-w64 it would link the posix threads
without purpose.
This allows users to change the features they build either in
CMakeCache.txt or by using a CMake GUI. The sources built for
liblzma are affected by this too, so only the necessary files
will be compiled.
Now, the LZMA_VERSION_MAJOR, LZMA_VERSION_MINOR, and LZMA_VERSION_PATCH
macros do not need to be on consecutive lines in version.h. They can be
separated by more whitespace, comments, or even other content, as long
as they appear in the proper order (major, minor, patch).
At least on some systems, GNU windres needs --use-temp-file
in addition to the \x20 hack to avoid spaces in the command line
argument. Hovever, that \x20 syntax is broken with llvm-windres
version 15.0.0 (results in "XZx20Utils") but luckily it works
with a regular space. Thus it is best to limit the workarounds
to GNU toolchain on Windows.
The command line tools cannot be built with MSVC for now but
they can be built with MinGW-w64.
Thanks to Iouri Kharon for the bug report and the original patch.
Previously, if threading was enabled HAVE_DECL_CLOCK_MONOTONIC would always
be set to 0 or 1. However, this macro was needed in xz so if xz was not
built with threading and HAVE_DECL_CLOCK_MONOTONIC was not defined but
HAVE_CLOCK_GETTIME was, it caused a warning during build. Now,
HAVE_DECL_CLOCK_MONOTONIC has been renamed to HAVE_CLOCK_MONOTONIC and
will only be set if it is 1.
It not only makes no sense to put symbol versions into a static library
but it can also cause breakage.
By default Libtool #defines PIC if building a shared library and
doesn't define it for static libraries. This is documented in the
Libtool manual. It can be overriden using --with-pic or --without-pic.
configure.ac detects if --with-pic or --without-pic is used and then
gives an error if neither --disable-shared nor --disable-static was
used at the same time. Thus, in normal situations it works to build
both shared and static library at the same time on GNU/Linux,
only --with-pic or --without-pic requires that only one type of
library is built.
Thanks to John Paul Adrian Glaubitz from Debian for reporting
the problem that occurred on ia64:
https://www.mail-archive.com/xz-devel@tukaani.org/msg00610.html
It also works on E2K as it supports these intrinsics.
On x86-64 runtime detection is used so the code keeps working on
older processors too. A CLMUL-only build can be done by using
-msse4.1 -mpclmul in CFLAGS and this will reduce the library
size since the generic implementation and its 8 KiB lookup table
will be omitted.
On 32-bit x86 this isn't used by default for now because by default
on 32-bit x86 the separate assembly file crc64_x86.S is used.
If --disable-assembler is used then this new CLMUL code is used
the same way as on 64-bit x86. However, a CLMUL-only build
(-msse4.1 -mpclmul) won't omit the 8 KiB lookup table on
32-bit x86 due to a currently-missing check for disabled
assembler usage.
The configure.ac check should be such that the code won't be
built if something in the toolchain doesn't support it but
--disable-clmul-crc option can be used to unconditionally
disable this feature.
CLMUL speeds up decompression of files that have compressed very
well (assuming CRC64 is used as a check type). It is know that
the CLMUL code is significantly slower than the generic code for
tiny inputs (especially 1-8 bytes but up to 16 bytes). If that
is a real-world problem then there is already a commented-out
variant that uses the generic version for small inputs.
Thanks to Ilya Kurdyukov for the original patch which was
derived from a white paper from Intel [1] (published in 2009)
and public domain code from [2] (released in 2016).
[1] https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf
[2] https://github.com/rawrunprotected/crc
This uses it for CRC table initializations when using --disable-small.
It avoids mythread_once() overhead. It also means that then
--disable-small --disable-threads is thread-safe if this attribute
is supported.
That is, the Filter ID will be changed once the design is final.
The current version will be removed. So files created with the
tempoary Filter ID won't be supported in the future.
The previous commit split liblzma.map into liblzma_linux.map and
liblzma_generic.map. This commit updates the CMake build for those.
common_w32res.rc dependency was listed under Linux/FreeBSD while
obviously it belongs to Windows when building a DLL.
These are a minor thing especially since the xz build has
some real problems still like lack of large file support
on 32-bit systems but I'll commit this since the code exists.
Thanks to Jia Tan.
The naming conflict with FindLibLZMA module gets worse.
Not avoiding it in the first place was stupid.
Normally find_package(LibLZMA) will use the module and
find_package(liblzma 5.2.5 REQUIRED CONFIG) will use the config
file even with a case insensitive file system. However, if
CMAKE_FIND_PACKAGE_PREFER_CONFIG is TRUE and the file system
is case insensitive, find_package(LibLZMA) will find our liblzma
config file instead of using FindLibLZMA module.
One big problem with this is that FindLibLZMA uses
LibLZMA::LibLZMA and we use liblzma::liblzma as the target
name. With target names CMake happens to be case sensitive.
To workaround this, this commit adds
add_library(LibLZMA::LibLZMA ALIAS liblzma::liblzma)
to the config file. Then both spellings work.
To make the behavior consistent between case sensitive and
insensitive file systems, the config and related files are
renamed from liblzmaConfig.cmake to liblzma-config.cmake style.
With this style CMake looks for lowercase version of the package
name so find_package(LiBLzmA 5.2.5 REQUIRED CONFIG) will work
to find our config file.
There are other differences between our config file and
FindLibLZMA so it's still possible that things break for
reasons other than the spelling of the target name. Hopefully
those situations aren't too common.
When the config file is available, it should always give as good or
better results as FindLibLZMA so this commit doesn't affect the
recommendation to use find_package(liblzma 5.2.5 REQUIRED CONFIG)
which explicitly avoids FindLibLZMA.
Thanks to Markus Rickert.
The syntax "if(DEFINED CACHE{FOO})" requires CMake 3.14.
In some other places the code treats the cache variables
like normal variables already (${FOO} or if(FOO) is used,
not ${CACHE{FOO}).
Thanks to ygrek for reporting the bug on IRC.
Seems that the phrase "add more quotes" from sh/bash scripting
applies to CMake as well. E.g. passing an unquoted list ${FOO}
to a function that expects one argument results in only the
first element of the list being passed as an argument and
the rest get ignored. Adding quotes helps ("${FOO}").
list(INSERT ...) is weird. Inserting an empty string to an empty
variable results in empty list, but inserting it to a non-empty
variable does insert an empty element to the list.
Since INSERT requires at least one element,
"${CMAKE_THREAD_LIBS_INIT}" needs to be quoted in CMakeLists.txt.
It might result in an empty element in the list. It seems to not
matter as empty elements consistently get ignored in that variable.
In fact, calling cmake_check_push_state() and cmake_check_pop_state()
will strip the empty elements from CMAKE_REQUIRED_LIBRARIES!
In addition to quoting fixes, this fixes checks for the cache
variables in tuklib_cpucores.cmake and tuklib_physmem.cmake.
Thanks to Martin Matuška for testing and reporting the problems.
These fixes aren't tested yet but hopefully they soon will be.
This does *NOT* replace the Autotools-based build system in
the foreseeable future. See the comment in the beginning
of CMakeLists.txt.
So far this has been tested only on GNU/Linux but I commit
it anyway to make it easier for others to test. Since I
haven't played much with CMake before, it's likely that
there are things that have been done in a silly or wrong
way and need to be fixed.