The macro lzma_attr_visibility_hidden has to be defined to make
fastpos.h usable. The visibility attribute is irrelevant to
fastpos_tablegen.c so simply #define the macro to an empty value.
fastpos_tablegen.c is never built by the included build systems
and so the problem wasn't noticed earlier. It's just a standalone
program for generating fastpos_table.c.
Fixes: https://github.com/tukaani-project/xz/pull/69
Thanks to GitHub user Jamaika1.
In ELF shared libs:
-fvisibility=hidden affects definitions of symbols but not
declarations.[*] This doesn't affect direct calls to functions
inside liblzma as a linker can replace a call to lzma_foo@plt
with a call directly to lzma_foo when -fvisibility=hidden is used.
[*] It has to be like this because otherwise every installed
header file would need to explictly set the symbol visibility
to default.
When accessing extern variables that aren't defined in the
same translation unit, compiler assumes that the variable has
the default visibility and thus indirection is needed. Unlike
function calls, linker cannot optimize this.
Using __attribute__((__visibility__("hidden"))) with the extern
variable declarations tells the compiler that indirection isn't
needed because the definition is in the same shared library.
About 15+ years ago, someone told me that it would be good if
the CRC tables would be defined in the same translation unit
as the C code of the CRC functions. While I understood that it
could help a tiny amount, I didn't want to change the code because
a separate translation unit for the CRC tables was needed for the
x86 assembly code anyway. But when visibility attributes are
supported, simply marking the extern declaration with the
hidden attribute will get identical result. When there are only
a few affected variables, this is trivial to do. I wish I had
understood this back then already.
MinGW (formely a MinGW.org Project, later the MinGW.OSDN Project
at <https://osdn.net/projects/mingw/>) has GCC 9.2.0 as the
most recent GCC package (released 2021-02-02). The project might
still be alive but majority of people have switched to MinGW-w64.
Thus it seems clearer to refer to MinGW-w64 in our API headers too.
Building with MinGW is likely to still work but I haven't tested it
in the recent years.
A CMake option LARGE_FILE_SUPPORT is created if and only if
-D_FILE_OFFSET_BITS=64 affects sizeof(off_t).
This is needed on many 32-bit platforms and even with 64-bit builds
with MinGW-w64 to get support for files larger than 2 GiB.
Autotools based build uses -pthread and thus adds it to Libs.private
in liblzma.pc. CMake doesn't use -pthread at all if pthread functions
are available in libc so Libs.private doesn't get -pthread either.
It properly adds -DLZMA_API_STATIC when compiling code that
will be linked against static liblzma. Having it there on
systems other than Windows does no harm.
See: https://www.msys2.org/docs/pkgconfig/
Now configure will fail if -fsanitize= is found in CFLAGS
and sanitizer-incompatible ifunc or Landlock sandboxing
would be used. These are incompatible with one or more sanitizers.
It's simpler to reject all -fsanitize= uses instead of trying to
pass those that might not cause problems.
CMake-based build was updated similarly. It lets the configuration
finish (SEND_ERROR instead of FATAL_ERROR) so that both error
messages can be seen at once.
The sandboxing on Linux now supports Landlock, which restricts all
supported filesystem actions after xz opens the files it needs. The
sandbox is only enabled when one file is input and we are writing to
standard out. With fsanitize=address,undefined, the instrumentation
needs to read additional files after the sandbox is in place. This
forces all xz based test to fail, so the sandbox must instead be
disabled.
Using set(ENABLE_THREADS "posix") is confusing because it sets
a new normal variable and leaves the cache entry with the same
name unchanged. The intent wasn't to change the cache entry so
this switches to a different variable name.
This way typos are caught quickly and compounding error messages
are avoided (a single typo could cause more than one error).
This keeps using SEND_ERROR when the system is lacking a feature
(like threading library or sandboxing method). This way the whole
configuration log will be generated in case someone wishes to
report a problem upstream.
This removes support for FreeBSD 10.0 and 10.1 which used
<sys/capability.h> instead of <sys/capsicum.h>. Support for
FreeBSD 10.1 ended on 2016-12-31. So now FreeBSD >= 10.2 is
required to enable Capsicum support.
This also removes support for Capsicum on Linux (libcaprights)
which seems to have been unmaintained since 2017 and Linux 4.11:
https://github.com/google/capsicum-linux
See the new comment in the code.
This also makes the check for clock_gettime() run with MinGW-w64
with which we don't want to use clock_gettime(). The previous
commit already took care of this situation.
This commit alone doesn't change anything in the real-world:
- configure.ac currently checks for clock_gettime() only
when using pthreads.
- CMakeLists.txt doesn't check for clock_gettime() on Windows.
So clock_gettime() wasn't used with MinGW-w64 before either.
clock_gettime() provides monotonic time and it's better than
gettimeofday() in this sense. But clock_gettime() is defined
in winpthreads, and liblzma or xz needs nothing else from
winpthreads. By avoiding clock_gettime(), we avoid the dependency on
libwinpthread-1.dll or the need to link against the static version.
As a bonus, GetTickCount64() and MinGW-w64's gettimeofday() can be
faster than clock_gettime(CLOCK_MONOTONIC, &tv). The resolution
is more than good enough for the progress indicator in xz.
This partially reverts creating crc_clmul.c
(8c0f9376f5) where is_clmul_supported()
was moved, extern'ed, and renamed to lzma_is_clmul_supported(). This
caused a problem when the function call to lzma_is_clmul_supported()
results in a call through the PLT. ifunc resolvers run very early in
the dynamic loading sequence, so the PLT may not be setup properly at
this point. Whether the PLT is used or not for
lzma_is_clmul_supported() depened upon the compiler-toolchain used and
flags.
In liblzma compiled with GCC, for instance, GCC will go through the PLT
for function calls internal to liblzma if the version scripts and
symbol visibility hiding are not used. If lazy-binding is disabled,
then it would have made any program linked with liblzma fail during
dynamic loading in the ifunc resolver.
Currently crc32 is always enabled, so COND_CHECK_CRC32 must always be
set. Because of this, it makes the recent change to conditionally
compile check/crc_clmul.c appear wrong since that file has CLMUL
implementations for both CRC32 and CRC64.
The option is enabled by default, but will only be visible to a user
listing cache variables or using a CMake GUI application if the
immintrin.h header file is found.
This mirrors our Autotools build --disable-clmul-crc functionality.
After forcing crc_simd_body() to always be inlined it caused
-fsanitize=address to fail for lzma_crc32_clmul() and
lzma_crc64_clmul(). The __no_sanitize_address__ attribute was added
to lzma_crc32_clmul() and lzma_crc64_clmul(), but not removed from
crc_simd_body(). ASAN and inline functions behavior has changed over
the years for GCC specifically, so while strictly required we will
keep __attribute__((__no_sanitize_address__)) on crc_simd_body() in
case this becomes a requirement in the future.
Older GCC versions refuse to inline a function with ASAN if the
caller and callee do not agree on sanitization flags
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89124#c3). If the
function was forced to be inlined, it will not compile if the callee
function has __no_sanitize_address__ but the caller doesn't.
PowerPC64LE wasn't tested but it seems like a safe change.
POWER8 supports unaligned access in little endian mode. Testing
on godbolt.org shows that GCC uses unaligned access by default.
The RISC-V macro __riscv_misaligned_fast is very new and not
in any stable compiler release yet.
Documentation in INSTALL was updated to match.
Documentation about an autodetection bug when using ARM64 GCC
with -mstrict-align was added to INSTALL.
CMake files weren't updated yet.
In XZ Utils context this doesn't matter much because
unaligned reads and writes aren't used in hot code
when TUKLIB_FAST_UNALIGNED_ACCESS isn't #defined.
After testing a 32-bit Release build on MSVC, only lzma_crc64_clmul()
has the bug. crc_simd_body() and lzma_crc32_clmul() do not need the
optimizations disabled.
Forcing this to be inline has a significant speed improvement at the
cost of a few repeated instructions. The compilers tested on did not
inline this function since it is large and is used twice in the same
translation unit.
This macro must be used instead of the inline keyword. On MSVC, it is
a replacement for __forceinline which is an MSVC specific keyword that
should not be used with inline (it will issue a warning if it is).
It does not use a build system check to determine if
__attribute__((__always_inline__)) since all compilers that can use
CLMUL extensions (except the special case for MSVC) should support this
attribute. If this assumption is incorrect then it will result in a bug
report instead of silently producing slow code.
A detailed description of the three dispatch methods was added. Also,
duplicated comments now only appear in crc32_fast.c or were removed from
both crc32_fast.c and crc64_fast.c if they appeared in crc_clmul.c.
Both crc32_clmul() and crc64_clmul() are now exported from
crc32_clmul.c as lzma_crc32_clmul() and lzma_crc64_clmul(). This
ensures that is_clmul_supported() (now lzma_is_clmul_supported()) is
not duplicated between crc32_fast.c and crc64_fast.c.
Also, it encapsulates the complexity of the CLMUL implementations into a
single file and reduces the complexity of crc32_fast.c and crc64_fast.c.
Before, CLMUL code was present in crc32_fast.c, crc64_fast.c, and
crc_common.h.
During the conversion, various cleanups were applied to code (thanks to
Lasse Collin) including:
- Require using semicolons with MASK_/L/H/LH macros.
- Variable typing and const handling improvements.
- Improvements to comments.
- Fixes to the pragmas used.
- Removed unneeded variables.
- Whitespace improvements.
- Fixed CRC_USE_GENERIC_FOR_SMALL_INPUTS handling.
- Silenced warnings and removed the need for some #pragmas