This only affects builds with UCRT. With legacy MSVCRT, the replacement
functions are always enabled.
Omitting the MinGW-w64 replacements saves over 20 KiB per executable.
The downside is that --enable-small or XZ_SMALL=ON disables thousand
separator support in xz messages. If someone is OK with the slower
speed of slightly smaller builds, lack of thousand separators won't
matter.
Don't override __USE_MINGW_ANSI_STDIO if it is already defined (via
CPPFLAGS or a similar method).
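As a minimal sketch of such a guard (the default value 1 and the helper
ansi_stdio_setting() are assumptions made for illustration and are not
taken from the xz sources), a header could define the macro only when
the builder hasn't already chosen a value:

```c
/*
 * Sketch only: respect a value given via CPPFLAGS, for example
 * -D__USE_MINGW_ANSI_STDIO=0, and otherwise pick a default.
 * The default of 1 is an assumption made for this illustration.
 */
#ifndef __USE_MINGW_ANSI_STDIO
#	define __USE_MINGW_ANSI_STDIO 1
#endif

#include <stdio.h>

/* Hypothetical helper: report which value ended up in effect. */
static int
ansi_stdio_setting(void)
{
	return __USE_MINGW_ANSI_STDIO;
}
```

The guard has to come before the first standard header is included,
since the headers read the macro when they are processed.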
Previously it was enabled only on x86-64 and ARM64, and only when
support for unaligned access was also detected or manually enabled at
build time.
In the default build configuration, the 8-byte method is now enabled
also on 64-bit RISC-V and 64-bit PowerPC (both endiannesses). It was
reported that on big endian POWER9, encoding time may decrease by 12-13 %.
This change only affects builds with GCC and Clang because the code
uses __builtin_ctzll or __builtin_clzll.
Thanks to Marcus Comstedt for testing on POWER9.
When the 8-byte method was enabled for ARM64, a check for endianness
wasn't added. This broke the LZMA/LZMA2 encoder. The test suite caught it.
Fixes: cd64dd70d5
Co-authored-by: Marcus Comstedt <marcus@mc.pp.se>
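To illustrate why an 8-byte comparison needs both the compiler builtins
and an endianness check, here is a minimal sketch. It is not the code
from liblzma: the name match_len_sketch() is hypothetical, and memcpy()
is used for the loads so that the sketch doesn't depend on unaligned
access support. It assumes GCC or Clang, which provide __BYTE_ORDER__
and the builtins.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return how many leading bytes of a and b are
 * equal, looking at no more than limit bytes.
 */
static size_t
match_len_sketch(const uint8_t *a, const uint8_t *b, size_t limit)
{
	size_t len = 0;

	/* Compare eight bytes at a time while a full chunk is left. */
	while (len + 8 <= limit) {
		uint64_t x;
		uint64_t y;
		memcpy(&x, a + len, sizeof(x));
		memcpy(&y, b + len, sizeof(y));

		const uint64_t diff = x ^ y;
		if (diff != 0) {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
			/* First byte in memory is the most significant. */
			return len + ((size_t)__builtin_clzll(diff) >> 3);
#else
			/* First byte in memory is the least significant. */
			return len + ((size_t)__builtin_ctzll(diff) >> 3);
#endif
		}

		len += 8;
	}

	/* Finish the tail one byte at a time. */
	while (len < limit && a[len] == b[len])
		++len;

	return len;
}
```

Using the wrong builtin for the byte order would count matching bytes
from the wrong end of the 64-bit word, which is the kind of mistake an
omitted endianness check leads to.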
I rewrapped a few overlong lines. Those edits aren't in the
Translation Project. Automatic wrapping in the master branch
means that these strings need to be updated soon anyway.
In testing with musl 1.2.5 and Linux 6.12, O_SEARCH doesn't result
in a file descriptor that works with fsync() although it should work.
See the added comment.
The same issue affected gzip --synchronous:
https://bugs.gnu.org/75405
Thanks to Paul Eggert.
Opening a directory with O_SEARCH results in a file descriptor that can
be used with functions like openat(). Such a file descriptor cannot be
used with fsync(). Use O_RDONLY instead.
In musl, O_SEARCH becomes Linux-specific O_PATH. A file descriptor
from O_PATH doesn't allow fsync().
It seems that it's not possible to fsync() a directory that has write
and search permissions but not read permission.
Fixes: 2a9e91d796
A typical use case is like this:
printf("%s: %s\n", tuklib_mask_nonprint(filename), strerror(errno));
tuklib_mask_nonprint() may call mbrtowc() and malloc() which may modify
errno. If errno isn't preserved, the error message might be wrong if
a compiler decides to call tuklib_mask_nonprint() before strerror().
Fixes: 40e5733055
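The errno handling described above can be sketched as follows. This is
not the tuklib implementation: mask_nonprint_sketch() is a hypothetical,
simplified stand-in that only handles single-byte characters. The point
is the save and restore of errno around calls that may change it.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for tuklib_mask_nonprint(): replace
 * non-printable bytes with '?' and leave errno untouched.
 */
static const char *
mask_nonprint_sketch(const char *str)
{
	static char *buf = NULL;

	/* realloc() may set errno, so remember the caller's value. */
	const int saved_errno = errno;

	const size_t len = strlen(str);
	char *tmp = realloc(buf, len + 1);
	if (tmp == NULL) {
		errno = saved_errno;
		return "???";
	}

	buf = tmp;
	for (size_t i = 0; i < len; ++i)
		buf[i] = isprint((unsigned char)str[i]) ? str[i] : '?';

	buf[len] = '\0';

	/* Restore errno so that a later strerror(errno) is correct. */
	errno = saved_errno;
	return buf;
}
```

With this, the printf() call shown earlier prints the right message
whichever argument the compiler evaluates first, because C doesn't
specify the evaluation order of function arguments.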
xz's default behavior is to delete the input file after successful
compression or decompression (unless writing to standard output).
If the system crashes soon after the deletion, it is possible that
the newly written file has not yet hit the disk while the previous
delete operation might have. In that case neither the original file
nor the written file is available.
Call fsync() on the file. On POSIX systems, also sync the directory
where the file was created.
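A minimal sketch of that sequence on a POSIX system is shown below. It
is not the code from xz: sync_file_and_dir_sketch() is a hypothetical
helper, error reporting is reduced to a boolean, and the directory is
opened with O_RDONLY.

```c
#ifndef _POSIX_C_SOURCE
#	define _POSIX_C_SOURCE 200809L
#endif

#include <fcntl.h>
#include <libgen.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: sync an already written file and then the
 * directory that contains it. Returns true on success.
 */
static bool
sync_file_and_dir_sketch(int file_fd, const char *filename)
{
	if (fsync(file_fd) != 0)
		return false;

	/* dirname() may modify its argument, so work on a copy. */
	char *copy = strdup(filename);
	if (copy == NULL)
		return false;

	/* The result of dirname() may point into copy; use it first. */
	const int dir_fd = open(dirname(copy), O_RDONLY);
	free(copy);
	if (dir_fd == -1)
		return false;

	const bool ok = fsync(dir_fd) == 0;
	(void)close(dir_fd);
	return ok;
}
```

Only after both calls succeed would it be safe, in this sketch, to
remove the input file.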
Add a new option --no-sync which disables fsync() usage. It can avoid
a (possibly significant) performance penalty when processing many
small files. It's fine to use --no-sync when one knows that the files
are easy to recreate or restore after a system crash.
Using fsync() after every flush initiated by --flush-timeout was
considered. It wasn't implemented, at least for now:
- --flush-timeout is typically used when writing to stdout. If stdout
is a file, xz cannot (portably) sync the directory of the file.
One would need to create the output file first, sync the directory,
and then run xz with fsync() enabled.
- If xz --flush-timeout output goes to a file, it's possible to use
a separate script to sync the file, for example, once per minute
while telling xz to flush more frequently.
- Not supporting syncing with --flush-timeout was simpler.
Portability notes:
- On systems that lack O_SEARCH (like Linux), "xz dir/file" will now
fail if "dir" cannot be opened for reading. If "dir" still has
write and search permissions (like d-wx------ in "ls -l"),
previously xz would still have been able to compress "dir/file".
Now it only works if using --no-sync (or --keep or --stdout).
- <libgen.h> and dirname() should be available on all POSIX systems,
and aren't needed on non-POSIX systems.
- fsync() is available on all POSIX systems. The directory syncing
could be changed to fdatasync() although at least on ext4 it
doesn't seem to make a performance difference in xz's usage.
fdatasync() would need a build system check to support (old)
special cases, for example, MINIX 3.3.0 doesn't have fdatasync()
and Solaris 10 needs -lrt.
- On native Windows, _commit() is used to replace fsync(). Directory
syncing isn't done and shouldn't be needed. (In Cygwin, fsync() on
directories is a no-op.)
- DJGPP has fsync() for files. ;-)
Using fsync() was considered somewhere around 2009 and again in 2016 but
both times the idea was rejected. For comparison, GNU gzip 1.7 (2016)
added the option --synchronous which enables fsync().
Co-authored-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Fixes: https://bugs.debian.org/814089
Link: https://www.mail-archive.com/xz-devel@tukaani.org/msg00282.html
Closes: https://github.com/tukaani-project/xz/pull/151
lzma_str_to_filters() may call parse_lzma12_preset() in two ways. The
call from str_to_filters() detects the string type from the first
character(s) and as a side-effect it validates the first digit of
the preset string. So this change makes no difference there.
However, the call from parse_options() doesn't pre-validate the string.
parse_lzma12_preset() will return an invalid value which is passed to
lzma_lzma_preset() which safely rejects it. The bug still affects the
error message:
$ xz --filters=lzma2:preset=X
xz: Error in --filters=FILTERS option:
xz: lzma2:preset=X
xz:               ^
xz: Unsupported preset
After the fix:
$ xz --filters=lzma2:preset=X
xz: Error in --filters=FILTERS option:
xz: lzma2:preset=X
xz:              ^
xz: Unsupported preset
The ^ now correctly points to the X and not past it because the X itself
is the problematic character.
Fixes: cedeeca2ea
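The idea of validating the first character inside the preset parser
itself, and reporting the position of the offending character, can be
sketched like this. It is not the liblzma code: parse_preset_sketch(),
PRESET_INVALID_SKETCH, and PRESET_EXTREME_SKETCH are hypothetical names,
and the accepted syntax (one digit followed by optional 'e' flags) is a
simplifying assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants used only by this sketch. */
#define PRESET_INVALID_SKETCH UINT32_MAX
#define PRESET_EXTREME_SKETCH (UINT32_C(1) << 31)

/*
 * Parse a preset string like "6" or "6e". On error, store the index
 * of the first bad character in *err_pos and return
 * PRESET_INVALID_SKETCH.
 */
static uint32_t
parse_preset_sketch(const char *str, size_t *err_pos)
{
	/* Validate the first character here instead of in the caller. */
	if (str[0] < '0' || str[0] > '9') {
		*err_pos = 0;
		return PRESET_INVALID_SKETCH;
	}

	uint32_t preset = (uint32_t)(str[0] - '0');

	for (size_t i = 1; str[i] != '\0'; ++i) {
		if (str[i] != 'e') {
			*err_pos = i;
			return PRESET_INVALID_SKETCH;
		}

		preset |= PRESET_EXTREME_SKETCH;
	}

	return preset;
}
```

Because the parser reports index 0 for input "X", a caller printing a
^ marker places it under the X instead of one column past it.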
Forgetting the argument (or not using = to separate the option from
the argument) resulted in lzma_str_to_filters() being called with NULL
as input string argument. The function handles it fine but xz passes
the NULL to printf() too:
$ xz --filters
xz: Error in --filters=FILTERS option:
xz: (null)
xz: ^
xz: Unexpected NULL pointer argument(s) to lzma_str_to_filters()
Now it's correct:
$ xz --filters
xz: option '--filters' requires an argument
The --filters-help option doesn't take any arguments.
Fixes: 9ded880a02
Fixes: d6af7f3470
Fixes: a165d7df19
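The difference in behavior can be sketched with getopt_long(). The
option table below is an illustration and not the table from xz:
parse_filters_sketch() and the values 'F' and 'H' are hypothetical.
With required_argument, a missing argument is reported by getopt_long()
itself, so the option handler never sees a NULL optarg.

```c
#include <getopt.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the argument of --filters, or NULL if
 * the option is missing or getopt_long() reported an error.
 */
static const char *
parse_filters_sketch(int argc, char **argv)
{
	static const struct option long_opts[] = {
		{ "filters",      required_argument, NULL, 'F' },
		{ "filters-help", no_argument,       NULL, 'H' },
		{ NULL,           0,                 NULL, 0 }
	};

	const char *filters = NULL;
	int c;

	optind = 1; /* Restart scanning so the sketch can be reused. */
	opterr = 0; /* Keep the sketch quiet; xz prints the message. */

	while ((c = getopt_long(argc, argv, "", long_opts, NULL)) != -1) {
		if (c == 'F')
			filters = optarg;
		else if (c == '?')
			return NULL; /* For example, a missing argument */
	}

	return filters;
}
```

Both "--filters=lzma2" and "--filters lzma2" give "lzma2" here, while a
bare "--filters" is rejected during option parsing.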
It's a POSIX feature that isn't in standard C. It's not available on
Windows. Even MinGW-w64 with __USE_MINGW_ANSI_STDIO doesn't support
it even though it supports POSIX %'d for thousand separators.
Gettext's <libintl.h> provides overrides for printf and other functions
which do support the %2$s formats. Translations use them. But xz should
work on Windows without <libintl.h> too.
Fixes: 3e9177fd20
A slightly silly thing is that xz may now query the ABI version up to
three times. We could call my_landlock_ruleset_attr_forbid_all() only
once and cache the result but it didn't seem worth doing.
Now that we have the FALLTHROUGH macro, use the strictest mode with
GCC so that comment-based fallthrough markings are no longer accepted.
In GCC, -Wextra includes -Wimplicit-fallthrough=3 and
-Wimplicit-fallthrough is the same as -Wimplicit-fallthrough=3.
Thus, the strict mode requires specifying -Wimplicit-fallthrough=5.
Clang has -Wimplicit-fallthrough which is *not* enabled by -Wextra.
Clang doesn't have a variant that takes an argument. Thus we need
to check for -Wimplicit-fallthrough. Do it before checking for
-Wimplicit-fallthrough=5 so that the latter overrides the former
when using GCC.
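For illustration, a FALLTHROUGH macro of this kind could look like the
sketch below. The definition is an assumption and not copied from the
xz sources, and count_steps_sketch() is a hypothetical function that
only exists to show the macro in use. With -Wimplicit-fallthrough=5,
GCC accepts the attribute but no longer accepts a comment in its place.

```c
/* Older compilers may lack __has_attribute; treat it as "no". */
#ifndef __has_attribute
#	define __has_attribute(x) 0
#endif

/* Assumed definition: use the attribute when it is available. */
#if __has_attribute(__fallthrough__)
#	define FALLTHROUGH __attribute__((__fallthrough__))
#else
#	define FALLTHROUGH ((void)0)
#endif

/*
 * Hypothetical example: count how many cases run when starting
 * from the given label.
 */
static int
count_steps_sketch(int start)
{
	int steps = 0;

	switch (start) {
	case 0:
		++steps;
		FALLTHROUGH;
	case 1:
		++steps;
		FALLTHROUGH;
	case 2:
		++steps;
		break;
	default:
		break;
	}

	return steps;
}
```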
Also remove the recently-added workaround from tuklib_gettext.h.
Requiring a new enough gettext-runtime is cleaner. I guess it's
mostly MSYS2 where xz is built with translation support, so once
MSYS2 has Gettext >= 0.23.1, this requirement shouldn't be a problem
in practice.
The DESCRIPTION section always explained it, and the OPTIONS section
only described the differences to the default behavior. However, new
users in a hurry may skip reading DESCRIPTION. The default behavior
is a bit dangerous, thus it's good to repeat in the --compress and
--decompress docs that the source file is removed after a successful
operation.
Fixes: https://github.com/tukaani-project/xz/issues/150