SSE2 is supported on every x86-64 processor. The SSE2 code is used on
32-bit x86 if compiler options permit unconditional use of SSE2.
dict_repeat() copies short random-sized unaligned buffers. At least
on glibc, FreeBSD, and Windows (MSYS2, UCRT, MSVCRT), memcpy() is
clearly faster than byte-by-byte copying in this use case. Compared
to the memcpy() version, the new SSE2 version reduces decompression
time by 0-5 % depending on the machine and libc. It should never be
slower than the memcpy() version.
However, on musl 1.2.5 on x86-64, the memcpy() version is the slowest.
Compared to the memcpy() version:
- The byte-by-version takes 6-7 % less time to decompress.
- The SSE2 version takes 16-18 % less time to decompress.
The numbers are from decompressing a Linux kernel source tarball in
single-threaded mode on older AMD and Intel systems. The tarball
compresses well, and thus dict_repeat() performance matters more
than with some other files.
This doesn't make any difference in practice because compilers can
already see that writing through the dict->buf pointer cannot modify
the contents of *dict itself: The LZMA decoder makes a local copy of
the lzma_dict structure, and even if it didn't, the pointer to
lzma_dict in the LZMA decoder is already "restrict".
It's nice to add "restrict" anyway. uint8_t is typically unsigned char
which can alias anything. Without the above conditions or "restrict",
compilers could need to assume that writing through dict->buf might
modify *dict. This would matter in dict_repeat() because the loops
refer to dict->buf and dict->pos instead of making local copies of
those members for the duration of the loops. If compilers had to
assume that writing through dict->buf can affect *dict, then compilers
would need to emit code that reloads dict->buf and dict->pos after
every write through dict->buf.
Revert back to a macro so that list(APPEND CMAKE_REQUIRED_DEFINITIONS)
will affect the calling scope. I had forgotten that while CMake
functions inherit the variables from the parent scope, the changes
to them are local unless using set(... PARENT_SCOPE).
This also means that the commit message in 5bb77d0920dc is wrong. The
commit itself is still fine, making it clearer that -DHAVE_SYS_PARAM_H
is only needed for specific check_c_source_compiles() calls.
Fixes: c1ea7bd0b60eed6ebcdf9a713ca69034f6f07179
Define NetBSD and Darwin/macOS feature test macros. Autoconf defines
these too (and a few others).
Define the macros on Windows except with MSVC. The _GNU_SOURCE macro
makes a difference with mingw-w64.
Use a function instead of a macro. Don't take the TARGET_OR_ALL argument
because there's always global effect because the global variable
CMAKE_REQUIRED_DEFINITIONS is modified.
All translators know that --command-line-options must not be translated.
With some other strings it's not obvious when the untranslated string
must be preserved. These comments hopefully help.
The deprecated aliases are lzcmp, lzdiff, lzless, lzmore,
lzgrep, lzegrep, and lzfgrep. The commands that start with
the xz prefix have identical behavior, for example, both
lzgrep and xzgrep handle all supported file formats.
This doesn't affect lzma, unlzma, lzcat, lzmadec, or lzmainfo.
The last release of LZMA Utils was made in 2008, but the lzma
compatibility alias for the gzip-like tool is still in common use.
Deprecating it would cause unnecessary breakage.
Nowaways $(top_builddir)/lib/getopt.h depends on headers in
$(top_srcdir)/lib, so both have to be in the include path.
CMake-based build already did this.
Fixes: 7e884c00d0093c38339f17fb1d280eec493f42ca
Now one can pass gl_replace_getopt=yes to configure to force the use
of GNU getopt_long from the lib directory. This only checks that the
value of gl_replace_getopt is non-empty, so one cannot force the
replacement to be disabled.
Closes: https://github.com/tukaani-project/xz/pull/166
Make it clearer that translations cannot be accepted if they don't
come via the Translation Project.
Column headings have been handled automatically for years and now --help
is autowrapped too, so the related instructions can be removed.
Make it work for out-of-tree builds without requiring one to specify
the location of the xz executable.
Add xz --filters-help.
Make the output shorter by reducing the number of xz -lvv test files.
Show the value of LANGUAGE environment variable.
Show the xz.git version using git describe --abbrev=8 instead of =4.