Sometimes the version number from "less -V" contains a dot,
sometimes not. xzless failed detect the version number when
it does contain a dot. This fixes it.
Thanks to nick87720z for reporting this. Apparently it had been
reported here <https://bugs.gentoo.org/489362> in 2013.
Due to architectural limitations, address space available to a single
userspace process on MIPS32 is limited to 2 GiB, not 4, even on systems
that have more physical RAM -- e.g. 64-bit systems with 32-bit
userspace, or systems that use XPA (an extension similar to x86's PAE).
So, for MIPS32, we have to impose stronger memory limits. I've chosen
2000MiB to give the process some headroom.
The naming conflict with FindLibLZMA module gets worse.
Not avoiding it in the first place was stupid.
Normally find_package(LibLZMA) will use the module and
find_package(liblzma 5.2.5 REQUIRED CONFIG) will use the config
file even with a case insensitive file system. However, if
CMAKE_FIND_PACKAGE_PREFER_CONFIG is TRUE and the file system
is case insensitive, find_package(LibLZMA) will find our liblzma
config file instead of using FindLibLZMA module.
One big problem with this is that FindLibLZMA uses
LibLZMA::LibLZMA and we use liblzma::liblzma as the target
name. With target names CMake happens to be case sensitive.
To workaround this, this commit adds
add_library(LibLZMA::LibLZMA ALIAS liblzma::liblzma)
to the config file. Then both spellings work.
To make the behavior consistent between case sensitive and
insensitive file systems, the config and related files are
renamed from liblzmaConfig.cmake to liblzma-config.cmake style.
With this style CMake looks for lowercase version of the package
name so find_package(LiBLzmA 5.2.5 REQUIRED CONFIG) will work
to find our config file.
There are other differences between our config file and
FindLibLZMA so it's still possible that things break for
reasons other than the spelling of the target name. Hopefully
those situations aren't too common.
When the config file is available, it should always give as good or
better results as FindLibLZMA so this commit doesn't affect the
recommendation to use find_package(liblzma 5.2.5 REQUIRED CONFIG)
which explicitly avoids FindLibLZMA.
Thanks to Markus Rickert.
When the uncompressed size is known to be exact, after decompressing
the stream exactly comp_size bytes of input must have been consumed.
This is a minor improvement to error detection.
The caller must still not specify an uncompressed size bigger
than the actual uncompressed size.
As a downside, this now needs the exact compressed size.
Right now this is just a planned extra-compact format for use
in the EROFS file system in Linux. At this point it's possible
that the format will either change or be abandoned and removed
completely.
The special thing about the encoder is that it uses the
output-size-limited encoding added in the previous commit.
EROFS uses fixed-sized blocks (e.g. 4 KiB) to hold compressed
data so the compressors must be able to create valid streams
that fill the given block size.
With this it is possible to encode LZMA1 data without EOPM so that
the encoder will encode as much input as it can without exceeding
the specified output size limit. The resulting LZMA1 stream will
be a normal LZMA1 stream without EOPM. The actual uncompressed size
will be available to the caller via the uncomp_size pointer.
One missing thing is that the LZMA layer doesn't inform the LZ layer
when the encoding is finished and thus the LZ may read more input
when it won't be used. However, this doesn't matter if encoding is
done with a single call (which is the planned use case for now).
For proper multi-call encoding this should be improved.
This commit only adds the functionality for internal use.
Nothing uses it yet.
Previously this required using --force but that has other
effects too which might be undesirable. Changing the behavior
of --keep has a small risk of breaking existing scripts but
since this is a fairly special corner case I expect the
likehood of breakage to be low enough.
I think the new behavior is more logical. The only reason for
the old behavior was to be consistent with gzip and bzip2.
Thanks to Vincent Lefevre and Sebastian Andrzej Siewior.
Omit the -q option from xz, gzip, and bzip2. With xz this shouldn't
matter. With gzip it's important because -q makes gzip replace SIGPIPE
with exit status 2. With bzip2 it's important because with -q bzip2
is completely silent if input is corrupt while other decompressors
still give an error message.
Avoiding exit status 2 from gzip is important because bzip2 uses
exit status 2 to indicate corrupt input. Before this commit xzgrep
didn't recognize corrupt .bz2 files because xzgrep was treating
exit status 2 as SIGPIPE for gzip compatibility.
zstd still needs -q because otherwise it is noisy in normal
operation.
The code to detect real SIGPIPE didn't check if the exit status
was due to a signal (>= 128) and so could ignore some other exit
status too.
This is a minor fix since this affects only the situation when
the files differ and the exit status is something else than 0.
In such case there could be SIGPIPE from a decompression tool
and that would result in exit status of 2 from xzdiff/xzcmp
while the correct behavior would be to return 1 or whatever
else diff or cmp may have returned.
This commit omits the -q option from xz/gzip/bzip2/lzop arguments.
I'm not sure why the -q was used in the first place, perhaps it
hides warnings in some situation that I cannot see at the moment.
Hopefully the removal won't introduce a new bug.
With gzip the -q option was harmful because it made gzip return 2
instead of >= 128 with SIGPIPE. Ignoring exit status 2 (warning
from gzip) isn't practical because bzip2 uses exit status 2 to
indicate corrupt input file. It's better if SIGPIPE results in
exit status >= 128.
With bzip2 the removal of -q seems to be good because with -q
it prints nothing if input is corrupt. The other tools aren't
silent in this situation even with -q. On the other hand, if
zstd support is added, it will need -q since otherwise it's
noisy in normal situations.
Thanks to Étienne Mollier and Sebastian Andrzej Siewior.
Before this commit all output queue buffers were allocated as
a single big allocation. Now each buffer is allocated separately
when needed. Used buffers are cached to avoid reallocation
overhead but the cache will keep only one buffer size at a time.
This should make things work OK in the decompression where most
of the time the buffer sizes will be the same but with some less
common files the buffer sizes may vary.
While this should work fine, it's still a bit preliminary
and may even get reverted if it turns out to be useless for
decompression.
When Intel CET is enabled, we need to include <cet.h> in assembly codes
to mark Intel CET support and add _CET_ENDBR to indirect jump targets.
Tested on Intel Tiger Lake under CET enabled Linux.
The syntax "if(DEFINED CACHE{FOO})" requires CMake 3.14.
In some other places the code treats the cache variables
like normal variables already (${FOO} or if(FOO) is used,
not ${CACHE{FOO}).
Thanks to ygrek for reporting the bug on IRC.
I don't want to use \c in macro arguments but groff_man(7)
suggests that \f has better portability. \f would be needed
for the .TP strings for portability reasons anyway.
Thanks to Bjarni Ingi Gislason.
A few are simply omitted, most are converted to "for example"
and surrounded with commas. Sounds like that this is better
style, for example, man-pages(7) recommends avoiding such
abbreviations except in parenthesis.
Thanks to Bjarni Ingi Gislason.
Docs of ancient troff/nroff mention \(em (em-dash) but not \(en
and \- was used for both minus and en-dash. I don't know how
portable \(en is nowadays but it can be changed back if someone
complains. At least GNU groff and OpenBSD's mandoc support it.
Thanks to Bjarni Ingi Gislason for the patch.
Output is from: test-groff -b -e -mandoc -T utf8 -rF0 -t -w w -z
[ "test-groff" is a developmental version of "groff" ]
Input file is ./src/scripts/xzgrep.1
<src/scripts/xzgrep.1>:20 (macro RB): only 1 argument, but more are expected
<src/scripts/xzgrep.1>:23 (macro RB): only 1 argument, but more are expected
<src/scripts/xzgrep.1>:26 (macro RB): only 1 argument, but more are expected
<src/scripts/xzgrep.1>:29 (macro RB): only 1 argument, but more are expected
<src/scripts/xzgrep.1>:32 (macro RB): only 1 argument, but more are expected
"abc..." does not mean the same as "abc ...".
The output from nroff and troff is unchanged except for the space
between "file" and "...".
Signed-off-by: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>
Summary:
mandoc -T lint xzgrep.1 :
mandoc: xzgrep.1:79:2: WARNING: skipping paragraph macro: PP empty
There is no change in the output of "nroff" and "troff".
Signed-off-by: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>
Output is from: test-groff -b -e -mandoc -T utf8 -rF0 -t -w w -z
[ "test-groff" is a developmental version of "groff" ]
Input file is ./src/xz/xz.1
<src/xz/xz.1>:408 (macro BR): only 1 argument, but more are expected
<src/xz/xz.1>:1009 (macro BR): only 1 argument, but more are expected
<src/xz/xz.1>:1743 (macro BR): only 1 argument, but more are expected
<src/xz/xz.1>:1920 (macro BR): only 1 argument, but more are expected
<src/xz/xz.1>:2213 (macro BR): only 1 argument, but more are expected
Output from nroff and troff is unchanged, except for a font change of a
full stop (.).
Signed-off-by: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>