Now extra buffer space is reserved so that repeating bytes for
any single match will never need to copy from two places (both
the beginning and the end of the buffer). This simplifies
dict_repeat() and helps a little with speed.
This seems to reduce .lzma decompression time about 2 %, so
with .xz and CRC it could be slightly less. The small things
add up still.
It's not completely obvious if this is better in the decoder.
It should be good if compiler can avoid creating a branch
(like using CMOV on x86).
This also makes lzma_encoder.c use the new macros.
The new decoder resumes the first decoder loop in the Resumable mode.
Then, the code executes in Non-resumable mode until it detects that it
cannot guarantee to have enough input/output to decode another symbol.
The Resumable mode is how the decoder has always worked. Before decoding
every input bit, it checks if there is enough space and will save its
location to be resumed later. When the decoder has more input/output,
it jumps back to the correct sequence in the Resumable mode code.
When the input/output buffers are large, the Resumable mode is much
slower than the Non-resumable because it has more branches and is harder
for the compiler to optimize since it is in a large switch block.
Early benchmarking shows significant time improvement (8-10% on gcc and
clang x86) by using the Non-resumable code as much as possible.
The new "safe" range decoder mode is the same as old range decoder, but
now the default behavior of the range decoder will not check if there is
enough input or output to complete the operation. When the buffers are
close to fully consumed, the "safe" operations must be used instead. This
will improve speed because it will reduce the number of branches needed
for most of the range decoder operations.
Perhaps the generated files aren't even copyrightable but
using the same license for them as for the rest of the liblzma
keeps things more consistent for tools that look for license info.
lzma_encoder_init() did not check for NULL options, but
lzma2_encoder_init() did. This is more of a code style improvement than
anything else to help make lzma_encoder_init() and lzma2_encoder_init()
more similar.
The macro lzma_attr_visibility_hidden has to be defined to make
fastpos.h usable. The visibility attribute is irrelevant to
fastpos_tablegen.c so simply #define the macro to an empty value.
fastpos_tablegen.c is never built by the included build systems
and so the problem wasn't noticed earlier. It's just a standalone
program for generating fastpos_table.c.
Fixes: https://github.com/tukaani-project/xz/pull/69
Thanks to GitHub user Jamaika1.
The lzma_mt_block_size() was previously just an internal function for
the multithreaded .xz encoder. It is used to provide a recommended Block
size for a given filter chain.
This function is helpful to determine the maximum Block size for the
multithreaded .xz encoder when one wants to change the filters between
blocks. Then, this determined Block size can be provided to
lzma_stream_encoder_mt() in the lzma_mt options parameter when
intializing the coder. This requires one to know all the filter chains
they are using before starting to encode (or at least the filter chain
that will need the largest Block size), but that isn't a bad limitation.
Some file formats need support for LZMA1 streams that don't use
the end of payload marker (EOPM) alias end of stream (EOS) marker.
So far liblzma API has supported decompressing such streams via
lzma_alone_decoder() when .lzma header specifies a known
uncompressed size. Encoding support hasn't been available in the API.
Instead of adding a new LZMA1-only API for this purpose, this commit
adds a new filter ID for use with raw encoder and decoder. The main
benefit of this approach is that then also filter chains are possible,
for example, if someone wants to implement support for .7z files that
use the x86 BCJ filter with LZMA1 (not BCJ2 as that isn't supported
in liblzma).
That is, if the specified nice_len is smaller than the minimum
of the match finder, silently use the match finder's minimum value
instead of reporting an error. The old behavior is annoying to users
and it complicates xz options handling too.
The encoder doesn't support dictionary sizes larger than 1536 MiB.
This is validated, for example, when calculating the memory usage
via lzma_raw_encoder_memusage(). It is also enforced by the LZ
part of the encoder initialization. However, LZMA encoder with
LZMA_MODE_NORMAL did an unsafe calculation with dict_size before
such validation and that results in an infinite loop if dict_size
was 2 << 30 or greater.
Turns out that this is needed for .lzma files as the spec in
LZMA SDK says that end marker may be present even if the size
is stored in the header. Such files are rare but exist in the
real world. The code in liblzma is so old that the spec didn't
exist in LZMA SDK back then and I had understood that such
files weren't possible (the lzma tool in LZMA SDK didn't
create such files).
This modifies the internal API so that LZMA decoder can be told
if EOPM is allowed even when the uncompressed size is known.
It's allowed with .lzma and not with other uses.
Thanks to Karl Beldan for reporting the problem.
Previously lzma_lzma_props_encode() and lzma_lzma2_props_encode()
assumed that the options pointers must be non-NULL because the
with these filters the API says it must never be NULL. It is
good to do these checks anyway.
With this it is possible to encode LZMA1 data without EOPM so that
the encoder will encode as much input as it can without exceeding
the specified output size limit. The resulting LZMA1 stream will
be a normal LZMA1 stream without EOPM. The actual uncompressed size
will be available to the caller via the uncomp_size pointer.
One missing thing is that the LZMA layer doesn't inform the LZ layer
when the encoding is finished and thus the LZ may read more input
when it won't be used. However, this doesn't matter if encoding is
done with a single call (which is the planned use case for now).
For proper multi-call encoding this should be improved.
This commit only adds the functionality for internal use.
Nothing uses it yet.
Only one definition was visible in a translation unit.
It avoided a few casts and temp variables but seems that
this hack doesn't work with link-time optimizations in compilers
as it's not C99/C11 compliant.
Fixes:
http://www.mail-archive.com/xz-devel@tukaani.org/msg00279.html
People shouldn't rely on the presets when decoding raw streams,
but xz uses the presets as the starting point for raw decoder
options anyway.
lzma_encocder_presets.c was renamed to lzma_presets.c to
make it clear it's not used solely by the encoder code.
There is a tiny risk of causing breakage: If an application
assigns lzma_stream.allocator to a non-const pointer, such
code won't compile anymore. I don't know why anyone would do
such a thing though, so in practice this shouldn't cause trouble.
Thanks to Jan Kratochvil for the patch.
Spot candidates by running these commands:
git ls-files |xargs perl -0777 -n \
-e 'while (/\b(then?|[iao]n|i[fst]|but|f?or|at|and|[dt]o)\s+\1\b/gims)' \
-e '{$n=($` =~ tr/\n/\n/ + 1); ($v=$&)=~s/\n/\\n/g; print "$ARGV:$n:$v\n"}'
Thanks to Jim Meyering for the original patch.
The decoder considered empty LZMA2 streams to be corrupt.
This shouldn't matter much with .xz files, because no encoder
creates empty LZMA2 streams in .xz. This bug is more likely
to cause problems in applications that use raw LZMA2 streams.
This has no semantic changes. I find the new names slightly
more logical and they match the names that are already used
in XZ Embedded.
The name fastpos wasn't changed (not worth the hassle).
This should reduce the cases where --extreme makes
compression worse. On the other hand, some other
files may now benefit slightly less from --extreme.