lzma_index_dup() calls index_dup_stream() which, in case of
an error, calls index_stream_end() to free memory allocated
by index_stream_init(). However, it illogically didn't
actually free the memory. To make it logical, the tree
handling code was modified a bit in addition to changing
index_stream_end().
Thanks to Evan Nemerson for the bug report.
This way an invalid filter chain is detected at the Stream
encoder initialization instead of delaying it to the first
call to lzma_code() which triggers the initialization of
the actual filter encoder(s).
Note that this slightly changes how lzma_block_header_decode()
has been documented. Earlier it said that the .version is set
to the lowest required value, but now it says that the .version
field is kept unchanged if possible. In practice this doesn't
affect any old code, because before this commit the only
possible .version was 0.
This commit just adds the function. Its uses will be in
separate commits.
This hasn't been tested much yet and it's perhaps a bit early
to commit it but if there are bugs they should get found quite
quickly.
Thanks to Jun I Jin from Intel for help and for pointing out
that string comparison needs to be optimized in liblzma.
Now --block-list=SIZES works with in the threaded mode too,
although the performance is still bad due to the use of
LZMA_FULL_FLUSH instead of the new LZMA_FULL_BARRIER.
Now liblzma only uses "mythread" functions and types
which are defined in mythread.h matching the desired
threading method.
Before Windows Vista, there is no direct equivalent to
pthread condition variables. Since this package doesn't
use pthread_cond_broadcast(), pre-Vista threading can
still be kept quite simple. The pre-Vista code doesn't
use anything that wasn't already available in Windows 95,
so the binaries should run even on Windows 95 if someone
happens to care.
To avoid false positives when detecting .lzma files,
rare values in dictionary size and uncompressed size fields
were rejected. They will still be rejected if .lzma files
are decoded with lzma_auto_decoder(), but when using
lzma_alone_decoder() directly, such files will now be accepted.
Hopefully this is an OK compromise.
This doesn't affect xz because xz still has its own file
format detection code. This does affect lzmadec though.
So after this commit lzmadec will accept files that xz or
xz-emulating-lzma doesn't.
NOTE: lzma_alone_decoder() still won't decode all .lzma files
because liblzma's LZMA decoder doesn't support lc + lp > 4.
Reported here:
http://sourceforge.net/projects/lzmautils/forums/forum/708858/topic/7068827
This race condition could cause a deadlock if lzma_end() was
called before finishing the encoding. This can happen with
xz with debugging enabled (non-debugging version doesn't
call lzma_end() before exiting).
This adds lzma_get_progress() to liblzma and takes advantage
of it in xz.
lzma_get_progress() collects progress information from
the thread-specific structures so that fairly accurate
progress information is available to applications. Adding
a new function seemed to be a better way than making the
information directly available in lzma_stream (like total_in
and total_out are) because collecting the information requires
locking mutexes. It's waste of time to do it more often than
the up to date information is actually needed by an application.
There is a tiny risk of causing breakage: If an application
assigns lzma_stream.allocator to a non-const pointer, such
code won't compile anymore. I don't know why anyone would do
such a thing though, so in practice this shouldn't cause trouble.
Thanks to Jan Kratochvil for the patch.
Spot candidates by running these commands:
git ls-files |xargs perl -0777 -n \
-e 'while (/\b(then?|[iao]n|i[fst]|but|f?or|at|and|[dt]o)\s+\1\b/gims)' \
-e '{$n=($` =~ tr/\n/\n/ + 1); ($v=$&)=~s/\n/\\n/g; print "$ARGV:$n:$v\n"}'
Thanks to Jim Meyering for the original patch.
This is the simplest method to do threading, which splits
the uncompressed data into blocks and compresses them
independently from each other. There's room for improvement
especially to reduce the memory usage, but nevertheless,
this is a good start.
Empty Block was created if the input buffer was empty.
Empty Block wastes a few bytes of space, but more importantly
it triggers a bug in XZ Utils 5.0.1 and older when trying
to decompress such a file. 5.0.1 and older consider such
files to be corrupt. I thought that no encoder creates empty
Blocks when releasing 5.0.2 but I was wrong.
The biggest problem was that the integrity check type
wasn't validated, and e.g. lzma_easy_buffer_encode()
would create a corrupt .xz Stream if given an unsupported
Check ID. Luckily applications don't usually try to use
an unsupport Check ID, so this bug is unlikely to cause
many real-world problems.
It leaks old filter options structures (hundred bytes or so)
every time the lzma_stream is reinitialized. With the xz tool,
this happens when compressing multiple files.
If any of the reserved members in lzma_stream are non-zero
or non-NULL, LZMA_OPTIONS_ERROR is returned. It is possible
that a new feature in the future is indicated by just setting
a reserved member to some other value, so the old liblzma
version need to catch it as an unsupported feature.
With bad luck, lzma_code() could return LZMA_BUF_ERROR
when it shouldn't.
This has been here since the early days of liblzma.
It got triggered by the modifications made to the xz
tool in commit 18c10c30d2833f394cd7bce0e6a821044b15832f
but only when decompressing .lzma files. Somehow I managed
to miss testing that with Valgrind earlier.
This fixes <http://bugs.gentoo.org/show_bug.cgi?id=305591>.
Thanks to Rafał Mużyło for helping to debug it on IRC.