Commit Graph

695 Commits

Author SHA1 Message Date
Lasse Collin 0cd45fc2bc liblzma: Support LZMA_FULL_FLUSH and _BARRIER in threaded encoder.
Now --block-list=SIZES works with in the threaded mode too,
although the performance is still bad due to the use of
LZMA_FULL_FLUSH instead of the new LZMA_FULL_BARRIER.
2013-10-02 20:05:23 +03:00
Lasse Collin 97bb38712f liblzma: Add LZMA_FULL_BARRIER support to single-threaded encoder.
In the single-threaded encoder LZMA_FULL_BARRIER is simply
an alias for LZMA_FULL_FLUSH.
2013-10-02 12:55:11 +03:00
Lasse Collin fef0c6b410 liblzma: Add block_buffer_encoder.h into Makefile.inc.
This should have been in b465da5988.
2013-09-17 11:57:51 +03:00
Lasse Collin 8083e03291 xz: Add a missing test for TUKLIB_DOSLIKE. 2013-09-17 11:55:38 +03:00
Lasse Collin 6b44b4a775 Add native threading support on Windows.
Now liblzma only uses "mythread" functions and types
which are defined in mythread.h matching the desired
threading method.

Before Windows Vista, there is no direct equivalent to
pthread condition variables. Since this package doesn't
use pthread_cond_broadcast(), pre-Vista threading can
still be kept quite simple. The pre-Vista code doesn't
use anything that wasn't already available in Windows 95,
so the binaries should run even on Windows 95 if someone
happens to care.
2013-09-17 11:52:28 +03:00
Lasse Collin 72975df6c8 Build: Create liblzma.pc in a src/liblzma/Makefile.am.
Previously it was done in configure, but doing that goes
against the Autoconf manual. Autoconf requires that it is
possible to override e.g. prefix after running configure
and that doesn't work correctly if liblzma.pc is created
by configure.

A potential downside of this change is that now e.g.
libdir in liblzma.pc is a standalone string instead of
being defined via ${prefix}, so if one overrides prefix
when running pkg-config the libdir won't get the new value.
I don't know if this matters in practice.

Thanks to Vincent Torri.
2013-09-09 20:37:03 +03:00
Lasse Collin 1c2b6e7e83 Fix the previous commit which broke the build.
Apparently I didn't even compile-test the previous commit.

Thanks to Christian Hesse.
2013-08-04 15:24:09 +03:00
Lasse Collin 124eb69c78 Windows: Add Windows support to tuklib_cpucores().
It is used for Cygwin too. I'm not sure if that is
a good or bad idea.

Thanks to Vincent Torri.
2013-08-03 13:52:58 +03:00
Lasse Collin dee6ad3d59 xz: Add preliminary support for --flush-timeout=TIMEOUT.
When --flush-timeout=TIMEOUT is used, xz will use
LZMA_SYNC_FLUSH if read() would block and at least
TIMEOUT milliseconds has elapsed since the previous flush.

This can be useful in realtime-like use cases where the
data is simultanously decompressed by another process
(possibly on a different computer). If new uncompressed
input data is produced slowly, without this option xz could
buffer the data for a long time until it would become
decompressible from the output.

If TIMEOUT is 0, the feature is disabled. This is the default.

This commit affects the compression side. Using xz for
the decompression side for the above purpose doesn't work
yet so well because there is quite a bit of input and
output buffering when decompressing.

The --long-help or man page were not updated yet.
The details of this feature may change.
2013-07-04 14:18:46 +03:00
Lasse Collin fa381acaf9 xz: Don't set src_eof=true after an I/O error because it's useless. 2013-07-04 13:41:03 +03:00
Lasse Collin ea00545bea xz: Fix the test when to read more input.
Testing for end of file was no longer correct after full flushing
became possible with --block-size=SIZE and --block-list=SIZES.
There was no bug in practice though because xz just made a few
unneeded zero-byte reads.
2013-07-04 13:25:11 +03:00
Lasse Collin 736903c64b xz: Move some of the timing code into mytime.[hc].
This switches units from microseconds to milliseconds.

New clock_gettime(CLOCK_MONOTONIC) will be used if available.
There is still a fallback to gettimeofday().
2013-07-04 12:51:57 +03:00
Lasse Collin c0627b3fce xz: Silence a warning seen with _FORTIFY_SOURCE=2.
Thanks to Christian Hesse.
2013-07-01 14:34:11 +03:00
Lasse Collin a37ae8b5eb Man pages: Use similar syntax for synopsis as in xz.
The man pages of lzmainfo, xzmore, and xzdec had similar
constructs as the man page of xz had before the commit
eb6ca9854b. Eric S. Raymond
didn't mention these man pages in his bug report, but
it's nice to be consistent.
2013-06-30 18:02:27 +03:00
Lasse Collin cdba9ddd87 xz: Use non-blocking I/O for the output file.
Now both reading and writing should be without
race conditions with signals.

They might still be signal handling issues left.
Signals are blocked during many operations to avoid
EINTR but it may cause problems e.g. if writing to
stderr blocks when trying to display an error message.
2013-06-29 15:59:13 +03:00
Lasse Collin e61a5c95da xz: Fix return value type in io_write_buf().
It didn't affect the behavior of the code since -1
becomes true anyway.
2013-06-28 23:56:17 +03:00
Lasse Collin 9dc319eabb xz: Use the self-pipe trick to avoid a race condition with signals.
It is possible that a signal to set user_abort arrives right
before a blocking system call is made. In this case the call
may block until another signal arrives, while the wanted
behavior is to make xz clean up and exit as soon as possible.

After this commit, the race condition is avoided with the
input side which already uses non-blocking I/O. The output
side still uses blocking I/O and thus has the race condition.
2013-06-28 23:48:05 +03:00
Lasse Collin 3541bc79d0 xz: Use non-blocking I/O for the input file. 2013-06-28 22:51:02 +03:00
Lasse Collin 78673a08be xz: Remove an outdated NetBSD-specific comment.
Nowadays errno == EFTYPE is documented in open(2).
2013-06-28 18:46:13 +03:00
Lasse Collin a616fdad34 xz: Fix error detection of fcntl(fd, F_SETFL, flags) calls.
POSIX says that fcntl(fd, F_SETFL, flags) returns -1 on
error and "other than -1" on success. This is how it is
documented e.g. on OpenBSD too. On Linux, success with
F_SETFL is always 0 (at least accorinding to fcntl(2)
from man-pages 3.51).
2013-06-28 18:09:47 +03:00
Lasse Collin 4a08a6e4c6 xz: Fix use of wrong variable in a fcntl() call.
Due to a wrong variable name, when writing a sparse file
to standard output, *all* file status flags were cleared
(to the extent the operating system allowed it) instead of
only clearing the O_APPEND flag. In practice this worked
fine in the common situations on GNU/Linux, but I didn't
check how it behaved elsewhere.

The original flags were still restored correctly. I still
changed the code to use a separate boolean variable to
indicate when the flags should be restored instead of
relying on a special value in stdout_flags.
2013-06-28 17:36:47 +03:00
Lasse Collin b790b435da xz: Fix assertion related to posix_fadvise().
Input file can be a FIFO or something else that doesn't
support posix_fadvise() so don't check the return value
even with an assertion. Nothing bad happens if the call
to posix_fadvise() fails.
2013-06-28 14:55:37 +03:00
Lasse Collin 84d2da6c9d xz: Check the value of lzma_stream_flags.version in --list.
It is a no-op for now, but if an old xz version is used
together with a newer liblzma that supports something new,
then this check becomes important and will stop the old xz
from trying to parse files that it won't understand.
2013-06-26 13:30:57 +03:00
Lasse Collin 46540e4c10 liblzma: Avoid a warning about a shadowed variable.
On Mac OS X wait() is declared in <sys/wait.h> that
we include one way or other so don't use "wait" as
a variable name.

Thanks to Christian Kujau.
2013-06-23 18:57:23 +03:00
Lasse Collin ebb501ec73 xz: Validate Uncompressed Size from Block Header in list.c.
This affects only "xz -lvv". Normal decompression with xz
already detected if Block Header and Index had mismatched
Uncompressed Size fields. So this just makes "xz -lvv"
show such files as corrupt instead of showing the
Uncompressed Size from Index.
2013-06-23 17:36:47 +03:00
Lasse Collin eb6ca9854b xz: Make the man page more friendly to doclifter.
Thanks to Eric S. Raymond.
2013-06-21 22:04:45 +03:00
Lasse Collin 0c0a1947e6 xz: A couple of man page fixes.
Now the interaction of presets and custom filter chains
is described correctly. Earlier it contradicted itself.

Thanks to DevHC who reported these issues on IRC to me
on 2012-12-14.
2013-06-21 21:54:59 +03:00
Lasse Collin 2fcda89939 xz: Fix interaction between preset and custom filter chains.
There was somewhat illogical behavior when --extreme was
specified and mixed with custom filter chains.

Before this commit, "xz -9 --lzma2 -e" was equivalent
to "xz --lzma2". After it is equivalent to "xz -6e"
(all earlier preset options get forgotten when a custom
filter chain is specified and the default preset is 6
to which -e is applied). I find this less illogical.

This also affects the meaning of "xz -9e --lzma2 -7".
Earlier it was equivalent to "xz -7e" (the -e specified
before a custom filter chain wasn't forgotten). Now it
is "xz -7". Note that "xz -7e" still is the same as "xz -e7".

Hopefully very few cared about this in the first place,
so pretty much no one should even notice this change.

Thanks to Conley Moorhous.
2013-06-21 21:50:26 +03:00
Lasse Collin 8957c58609 xzdec: Improve the --help message.
The options are now ordered in the same order as in xz's help
message.

Descriptions were added to the options that are ignored.
I left them in parenthesis even if it looks a bit weird
because I find it easier to spot the ignored vs. non-ignored
options from the list that way.
2013-04-15 19:29:09 +03:00
Jeff Bastian 5019413a05 xzgrep: make the '-h' option to be --no-filename equivalent
* src/scripts/xzgrep.in: Accept the '-h' option in argument parsing.
2013-04-05 19:14:50 +03:00
Lasse Collin 5ea900cb5a liblzma: Be less picky in lzma_alone_decoder().
To avoid false positives when detecting .lzma files,
rare values in dictionary size and uncompressed size fields
were rejected. They will still be rejected if .lzma files
are decoded with lzma_auto_decoder(), but when using
lzma_alone_decoder() directly, such files will now be accepted.
Hopefully this is an OK compromise.

This doesn't affect xz because xz still has its own file
format detection code. This does affect lzmadec though.
So after this commit lzmadec will accept files that xz or
xz-emulating-lzma doesn't.

NOTE: lzma_alone_decoder() still won't decode all .lzma files
because liblzma's LZMA decoder doesn't support lc + lp > 4.

Reported here:
http://sourceforge.net/projects/lzmautils/forums/forum/708858/topic/7068827
2013-03-23 22:25:15 +02:00
Lasse Collin bb117fffa8 liblzma: Use lzma_block_buffer_bound64() in threaded encoder.
Now it uses lzma_block_uncomp_encode() if the data doesn't
fit into the space calculated by lzma_block_buffer_bound64().
2013-03-23 21:55:13 +02:00
Lasse Collin e572e123b5 liblzma: Fix another deadlock in the threaded encoder.
This race condition could cause a deadlock if lzma_end() was
called before finishing the encoding. This can happen with
xz with debugging enabled (non-debugging version doesn't
call lzma_end() before exiting).
2013-03-23 21:51:38 +02:00
Lasse Collin b465da5988 liblzma: Add lzma_block_uncomp_encode().
This also adds a new internal function
lzma_block_buffer_bound64() which is similar to
lzma_block_buffer_bound() but uses uint64_t instead
of size_t.
2013-03-23 19:17:33 +02:00
Lasse Collin 9e6dabcf22 Avoid unneeded use of awk in xzless.
Use "read" instead of "awk" in xzless to get the version
number of "less". The need for awk was introduced in
the commit db5c1817fa.

Thanks to Ariel P for the patch.
2013-03-05 19:14:50 +02:00
Lasse Collin e7b424d267 Make the progress indicator smooth in threaded mode.
This adds lzma_get_progress() to liblzma and takes advantage
of it in xz.

lzma_get_progress() collects progress information from
the thread-specific structures so that fairly accurate
progress information is available to applications. Adding
a new function seemed to be a better way than making the
information directly available in lzma_stream (like total_in
and total_out are) because collecting the information requires
locking mutexes. It's waste of time to do it more often than
the up to date information is actually needed by an application.
2012-12-14 20:13:32 +02:00
Lasse Collin 2ebbb994e3 liblzma: Fix mythread_sync for nested locking. 2012-12-14 11:01:41 +02:00
Lasse Collin 4c7e28705f xz: Mention --threads in --help.
Thanks to Olivier Delhomme for pointing out that this
was still missing.
2012-12-13 21:05:36 +02:00
Jonathan Nieder db5c1817fa xzless: Make "less -V" parsing more robust
In v4.999.9beta~30 (xzless: Support compressed standard input,
2009-08-09), xzless learned to parse ‘less -V’ output to figure out
whether less is new enough to handle $LESSOPEN settings starting
with “|-”.  That worked well for a while, but the version string from
‘less’ versions 448 (June, 2012) is misparsed, producing a warning:

	$ xzless /tmp/test.xz; echo $?
	/usr/bin/xzless: line 49: test: 456 (GNU regular expressions): \
	integer expression expected
	0

More precisely, modern ‘less’ lists the regexp implementation along
with its version number, and xzless passes the entire version number
with attached parenthetical phrase as a number to "test $a -gt $b",
producing the above confusing message.

	$ less-444 -V | head -1
	less 444
	$ less -V | head -1
	less 456 (no regular expressions)

So relax the pattern matched --- instead of expecting "less <number>",
look for a line of the form "less <number>[ (extra parenthetical)]".
While at it, improve the behavior when no matching line is found ---
instead of producing a cryptic message, we can fall back on a LESSPIPE
setting that is supported by all versions of ‘less’.

The implementation uses "awk" for simplicity.  Hopefully that’s
portable enough.

Reported-by: Jörg-Volker Peetz <jvpeetz@web.de>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2012-11-21 19:19:44 +02:00
Lasse Collin 65536214a3 xz: Fix the note about --rsyncable on the man page. 2012-10-03 15:54:24 +03:00
Lasse Collin 3d93b63549 xz: Improve handling of failed realloc in xrealloc.
Thanks to Jim Meyering.
2012-09-28 20:11:09 +03:00
Lasse Collin ab22562066 A few typo fixes to comments and the xz man page.
Thanks to Jim Meyering.
2012-08-24 16:27:31 +03:00
Lasse Collin f3c1ec69d9 xz: Add a warning to --help about alpha and beta versions. 2012-08-13 21:40:09 +03:00
Lasse Collin 3778db1be5 liblzma: Make the use of lzma_allocator const-correct.
There is a tiny risk of causing breakage: If an application
assigns lzma_stream.allocator to a non-const pointer, such
code won't compile anymore. I don't know why anyone would do
such a thing though, so in practice this shouldn't cause trouble.

Thanks to Jan Kratochvil for the patch.
2012-07-17 18:19:59 +03:00
Lasse Collin d6e0b23d46 Build: Include validate_map.sh in the distribution.
It's required by "make mydist".

Fix also the location of EXTRA_DIST+= so that those files
get distributed also if symbol versioning isn't enabled.
2012-07-05 07:28:53 +03:00
Lasse Collin cafb523ada xz: Document --block-list better.
Thanks to Jonathan Nieder.
2012-07-04 22:31:58 +03:00
Lasse Collin c7ff218528 Bump the version number to 5.1.2alpha. 2012-07-04 20:01:49 +03:00
Lasse Collin 0d5fa05466 xz: Fix the version number printed by xz -lvv.
The decoder bug was fixed in 5.0.2 instead of 5.0.3.
2012-07-04 19:58:23 +03:00
Lasse Collin 88ccf47205 xz: Add incomplete support for --block-list.
It's broken with threads and when also --block-size is used.
2012-07-03 21:16:39 +03:00
Lasse Collin 972179cdcd xz: Update the man page about the new field in --robot -lvv. 2012-07-01 18:44:33 +03:00
Lasse Collin 1403707fc6 liblzma: Check that the first byte of range encoded data is 0x00.
It is just to be more pedantic and thus perhaps catch broken
files slightly earlier.
2012-06-28 10:47:49 +03:00
Lasse Collin 2e6754eac2 xz: Update man page date to match the latest update. 2012-06-22 14:34:03 +03:00
Lasse Collin d8db706acb liblzma: Fix possibility of incorrect LZMA_BUF_ERROR.
lzma_code() could incorrectly return LZMA_BUF_ERROR if
all of the following was true:

  - The caller knows how many bytes of output to expect
    and only provides that much output space.

  - When the last output bytes are decoded, the
    caller-provided input buffer ends right before
    the LZMA2 end of payload marker. So LZMA2 won't
    provide more output anymore, but it won't know it
    yet and thus won't return LZMA_STREAM_END yet.

  - A BCJ filter is in use and it hasn't left any
    unfiltered bytes in the temp buffer. This can happen
    with any BCJ filter, but in practice it's more likely
    with filters other than the x86 BCJ.

Another situation where the bug can be triggered happens
if the uncompressed size is zero bytes and no output space
is provided. In this case the decompression can fail even
if the whole input file is given to lzma_code().

A similar bug was fixed in XZ Embedded on 2011-09-19.
2012-05-28 20:42:11 +03:00
Lasse Collin 7769ea051d xz: Don't show a huge number in -vv when memory limit is disabled. 2012-05-28 15:37:43 +03:00
Lasse Collin ec92110572 xz: Document the "summary" lines of --robot -lvv.
This documents only the columns that are in v5.0.
The new columns added in the master branch aren't
necessarily stable yet.
2012-05-27 22:30:17 +03:00
Lasse Collin 27d24eb0a9 xz: Fix output of verbose --robot --list modes.
It printed the filename in "filename (x/y)" format
which it obviously shouldn't do in robot mode.
2012-05-27 21:53:20 +03:00
Lasse Collin aac1b31ea4 liblzma: Remove outdated comments. 2012-04-19 15:25:26 +03:00
Lasse Collin 03ed742a3a liblzma: Fix Libs.private in liblzma.pc to include -lrt when needed. 2012-04-19 14:02:25 +03:00
Lasse Collin cff070aba6 Fix exit status of xzgrep when grepping binary files.
When grepping binary files, grep may exit before it has
read all the input. In this case, gzip -q returns 2 (eating
SIGPIPE), but xz and bzip2 show SIGPIPE as the exit status
(e.g. 141). This causes wrong exit status when grepping
xz- or bzip2-compressed binary files.

The fix checks for the special exit status that indicates SIGPIPE.
It uses kill -l which should be supported everywhere since it
is in both SUSv2 (1997) and POSIX.1-2008.

Thanks to James Buren for the bug report.
2012-02-22 14:02:34 +02:00
Lasse Collin 418fe668b3 xz: Show minimum required XZ Utils version in xz -lvv.
Man page wasn't updated yet.
2011-11-07 13:07:52 +02:00
Lasse Collin 7081d82c37 xz: Fix a typo in a comment.
Thanks to Bela Lubkin.
2011-11-04 17:57:16 +02:00
Lasse Collin 74d2bae4d3 xz: Fix xz on EBCDIC systems.
Thanks to Chris Donawa.
2011-11-03 17:07:22 +02:00
Lasse Collin ab50ae3ef4 liblzma: Fix invalid free() in the threaded encoder.
It was triggered if initialization failed e.g. due to
running out of memory.

Thanks to Arkadiusz Miskiewicz.
2011-10-23 17:08:14 +03:00
Lasse Collin 6b620a0f08 liblzma: Fix a deadlock in the threaded encoder.
It was triggered when reinitializing the encoder,
e.g. when encoding two files.
2011-10-23 17:05:55 +03:00
Lasse Collin 5b1e1f1074 Workaround unusual SIZE_MAX on SCO OpenServer. 2011-08-09 21:16:44 +03:00
Lasse Collin 1c673e5681 Fix exit status of "xzdiff foo.xz bar.xz".
xzdiff was clobbering the exit status from diff in a case
statement used to analyze the exit statuses from "xz" when
its operands were two compressed files. Save and restore
diff's exit status to fix this.

The bug is inherited from zdiff in GNU gzip and was fixed
there on 2009-10-09.

Thanks to Jonathan Nieder for the patch and
to Peter Pallinger for reporting the bug.
2011-07-31 11:01:47 +03:00
Lasse Collin 324cde7a86 liblzma: Remove unneeded semicolon. 2011-06-16 12:15:29 +03:00
Lasse Collin fc4d443696 Don't call close(-1) in tuklib_open_stdxxx() on error.
Thanks to Jim Meyering.
2011-05-28 16:43:26 +03:00
Lasse Collin bd35d903a0 liblzma: Use symbol versioning.
Symbol versioning is enabled by default on GNU/Linux,
other GNU-based systems, and FreeBSD.

I'm not sure how stable this is, so it may need
backward-incompatible changes before the next release.

The idea is that alpha and beta symbols are considered
unstable and require recompiling the applications that
use those symbols. Once a symbol is stable, it may get
extended with new features in ways that don't break
compatibility with older ABI & API.

The mydist target runs validate_map.sh which should
catch some probable problems in liblzma.map. Otherwise
I would forget to update the map file for new releases.
2011-05-28 15:55:39 +03:00
Lasse Collin c029744506 xz: Fix error handling in xz -lvv.
It could do an invalid free() and read past the end
of the uninitialized filters array.
2011-05-27 22:25:44 +03:00
Lasse Collin 8bd91918ac liblzma: Handle allocation failures correctly in lzma_index_init().
Thanks to Jim Meyering.
2011-05-27 22:09:49 +03:00
Lasse Collin 4161d76349 xz: Translate also the string used to print the program name.
French needs a space before a colon, e.g. "xz : foo error".
2011-05-21 15:12:10 +03:00
Lasse Collin b94aa0c838 liblzma: Try to use SHA-256 from the operating system.
If the operating system libc or other base libraries
provide SHA-256, use that instead of our own copy.
Note that this doesn't use OpenSSL or libgcrypt or
such libraries to avoid creating dependencies to
other packages.

This supports at least FreeBSD, NetBSD, OpenBSD, Solaris,
MINIX, and Darwin. They all provide similar but not
identical SHA-256 APIs; everyone is a little different.

Thanks to Wim Lewis for the original patch, improvements,
and testing.
2011-05-21 15:08:44 +03:00
Lasse Collin f004128678 Don't use clockid_t in mythread.h when clock_gettime() isn't available.
Thanks to Wim Lewis for the patch.
2011-05-17 12:52:18 +03:00
Lasse Collin 4c6e146df9 Add underscores to attributes (__attribute((__foo__))). 2011-05-17 11:54:38 +03:00
Lasse Collin 7a480e4859 xz: Fix input file position when --single-stream is used.
Now the following works as you would expect:

    echo foo | xz > foo.xz
    echo bar | xz >> foo.xz
    ( xz -dc --single-stream ; xz -dc --single-stream ) < foo.xz

Note that it doesn't work if the input is not seekable
or if there is Stream Padding between the concatenated
.xz Streams.
2011-05-01 12:24:23 +03:00
Lasse Collin c29e6630c1 xz: Print the maximum number of worker threads in xz -vv. 2011-05-01 12:15:51 +03:00
Lasse Collin 9c1b05828a Fix portability problems in mythread.h.
Use gettimeofday() if clock_gettime() isn't available
(e.g. Darwin).

The test for availability of pthread_condattr_setclock()
and CLOCK_MONOTONIC was incorrect. Instead of fixing the
#ifdefs, use an Autoconf test. That way if there exists a
system that supports them but doesn't specify the matching
POSIX #defines, the features will still get detected.

Don't try to use pthread_sigmask() on OpenVMS. It doesn't
have that function.

Guard mythread.h against being #included multiple times.
2011-04-19 09:20:44 +03:00
Martin Väth bd5002f582 xzgrep: fix typo in $0 parsing
Reported-by: Diego Elio Pettenò <flameeyes@gentoo.org>
Signed-off-by: Martin Väth <vaeth@mathematik.uni-wuerzburg.de>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2011-04-18 19:33:27 +03:00
Lasse Collin 6ef4eabc0a Bump the version number to 5.1.1alpha and liblzma soname to 5.0.99. 2011-04-12 12:48:31 +03:00
Lasse Collin 9a4377be0d Put the unstable APIs behind #ifdef LZMA_UNSTABLE.
This way people hopefully won't complain if these APIs
change and break code that used an older API.
2011-04-12 12:42:37 +03:00
Lasse Collin 3e321a3acd Remove doubled words from documentation and comments.
Spot candidates by running these commands:
  git ls-files |xargs perl -0777 -n \
    -e 'while (/\b(then?|[iao]n|i[fst]|but|f?or|at|and|[dt]o)\s+\1\b/gims)' \
    -e '{$n=($` =~ tr/\n/\n/ + 1); ($v=$&)=~s/\n/\\n/g; print "$ARGV:$n:$v\n"}'

Thanks to Jim Meyering for the original patch.
2011-04-12 11:59:49 +03:00
Lasse Collin 70e750f597 xz: Update the man page about threading. 2011-04-12 11:08:55 +03:00
Lasse Collin 24e0406c0f xz: Add support for threaded compression. 2011-04-11 22:06:03 +03:00
Lasse Collin de678e0c92 liblzma: Add lzma_stream_encoder_mt() for threaded compression.
This is the simplest method to do threading, which splits
the uncompressed data into blocks and compresses them
independently from each other. There's room for improvement
especially to reduce the memory usage, but nevertheless,
this is a good start.
2011-04-11 22:03:30 +03:00
Lasse Collin 25fe729532 liblzma: Add the forgotten lzma_lzma2_block_size().
This should have been in 5eefc0086d.
2011-04-11 21:15:07 +03:00
Lasse Collin 91afb785a1 liblzma: Document lzma_easy_(enc|dec)oder_memusage() better too. 2011-04-11 21:04:13 +03:00
Lasse Collin 4a9905302a liblzma: Document lzma_raw_(enc|dec)oder_memusage() better.
It didn't mention the return value that is used if
an error occurs.
2011-04-11 20:59:07 +03:00
Lasse Collin 0badb0b1bd liblzma: Use memzero() to initialize supported_actions[].
This is cleaner and makes it simpler to add new members
to lzma_action enumeration.
2011-04-11 19:28:18 +03:00
Lasse Collin a7934c446a liblzma: API comment about lzma_allocator with threaded coding. 2011-04-11 19:26:27 +03:00
Lasse Collin 5eefc0086d liblzma: Add an internal function lzma_mt_block_size().
This is based lzma_chunk_size() that was included in some
development version of liblzma.
2011-04-11 19:16:30 +03:00
Lasse Collin d119927475 liblzma: Don't create an empty Block in lzma_stream_buffer_encode().
Empty Block was created if the input buffer was empty.
Empty Block wastes a few bytes of space, but more importantly
it triggers a bug in XZ Utils 5.0.1 and older when trying
to decompress such a file. 5.0.1 and older consider such
files to be corrupt. I thought that no encoder creates empty
Blocks when releasing 5.0.2 but I was wrong.
2011-04-11 13:59:50 +03:00
Lasse Collin 3b22fc2c87 liblzma: Fix API docs to mention LZMA_UNSUPPORTED_CHECK.
This return value was missing from the API comments of
four functions.
2011-04-11 13:28:40 +03:00
Lasse Collin 71b9380145 liblzma: Validate encoder arguments better.
The biggest problem was that the integrity check type
wasn't validated, and e.g. lzma_easy_buffer_encode()
would create a corrupt .xz Stream if given an unsupported
Check ID. Luckily applications don't usually try to use
an unsupport Check ID, so this bug is unlikely to cause
many real-world problems.
2011-04-11 13:21:28 +03:00
Lasse Collin ec7e3dbad7 xz: Move the description of --block-size in --long-help. 2011-04-11 09:57:30 +03:00
Lasse Collin cd3086ff44 Docs: Document --single-stream and --block-size. 2011-04-11 09:55:35 +03:00
Lasse Collin fb64a49243 liblzma: Make lzma_stream_encoder_init() static (second try).
It's an internal function and it's not needed by
anything outside stream_encoder.c.
2011-04-11 09:27:57 +03:00
Lasse Collin a34730cf6a Revert "liblzma: Make lzma_stream_encoder_init() static."
This reverts commit 352ac82db5.
I don't know what I was thinking.
2011-04-11 08:31:42 +03:00
Lasse Collin 9f0a806aef Revise mythread.h.
This adds:

  - mythread_sync() macro to create synchronized blocks

  - mythread_cond structure and related functions
    and macros for condition variables with timed
    waiting using a relative timeout

  - mythread_create() to create a thread with all
    signals blocked

Some of these wouldn't need to be inline functions,
but I'll keep them this way for now for simplicity.

For timed waiting on a condition variable, librt is
now required on some systems to use clock_gettime().
configure.ac was updated to handle this.
2011-04-10 21:23:21 +03:00
Lasse Collin 352ac82db5 liblzma: Make lzma_stream_encoder_init() static.
It's an internal function and it's not needed by
anything outside stream_encoder.c.
2011-04-10 20:37:36 +03:00
Lasse Collin ebd54dbd6e xz/DOS: Add experimental 8.3 filename support.
This is incompatible with the 8.3 support patch made by
Juan Manuel Guerrero. I think this one is nicer, but
I need to get feedback from DOS users before saying
that this is the final version of 8.3 filename support.
2011-04-10 13:09:42 +03:00
Lasse Collin cd4fe97852 xz/DOS: Be more careful with the destination file.
Try to avoid overwriting the source file if --force is
used and the generated destination filename refers to
the source file. This can happen with 8.3 filenames where
extra characters are ignored.

If the generated output file refers to a special file
like "con" or "prn", refuse to write to it even if --force
is used.
2011-04-10 12:47:47 +03:00
Lasse Collin fca396b374 liblzma: Add missing #ifdefs to filter_common.c.
Passing --disable-decoders to configure broke a few
encoders due to missing #ifdefs in filter_common.c.

Thanks to Jason Gorski for the patch.
2011-04-09 18:28:58 +03:00
Lasse Collin b03f6cd3eb xz: Avoid unneeded fstat() on DOS-like systems. 2011-04-09 15:24:59 +03:00
Lasse Collin 335fe260a8 xz: Minor internal changes to handling of --threads.
Now it always defaults to one thread. Maybe this
will change again if a threading method is added
that doesn't affect memory usage.
2011-04-09 15:11:13 +03:00
Lasse Collin 9edd6ee895 xz: Change size_t to uint32_t in a few places. 2011-04-08 17:53:05 +03:00
Lasse Collin 411013ea45 xz: Fix a typo in a comment. 2011-04-08 17:48:41 +03:00
Lasse Collin b34c5ce4b2 liblzma: Use TUKLIB_GNUC_REQ to check GCC version in sha256.c. 2011-04-05 22:41:33 +03:00
Lasse Collin 1039bfcfc0 xz: Use posix_fadvise() if it is available. 2011-04-05 15:27:26 +03:00
Lasse Collin 1ef3cf44a8 xz: Call lzma_end(&strm) before exiting if debugging is enabled. 2011-04-05 15:13:29 +03:00
Lasse Collin bd432015d3 liblzma: Fix a memory leak in stream_encoder.c.
It leaks old filter options structures (hundred bytes or so)
every time the lzma_stream is reinitialized. With the xz tool,
this happens when compressing multiple files.
2011-04-02 14:49:56 +03:00
Lasse Collin 0d21f49a80 liblzma: Fix decoding of LZMA2 streams having no uncompressed data.
The decoder considered empty LZMA2 streams to be corrupt.
This shouldn't matter much with .xz files, because no encoder
creates empty LZMA2 streams in .xz. This bug is more likely
to cause problems in applications that use raw LZMA2 streams.
2011-03-31 11:54:48 +03:00
Lasse Collin 40277998cb Scripts: Better fix for xzgrep.
Now it uses "grep -q".

Thanks to Gregory Margo.
2011-03-24 01:42:49 +02:00
Lasse Collin c7210d9a3f Scripts: Fix xzgrep -l.
It didn't work at all. It tried to use the -q option
for grep, but it appended it after "--". This works
around it by redirecting to /dev/null. The downside
is that this can be slower with big files compared
to proper use of "grep -q".

Thanks to Gregory Margo.
2011-03-24 01:21:32 +02:00
Lasse Collin 4eb83e3204 Scripts: Add lzop (.lzo) support to xzdiff and xzgrep. 2011-03-19 13:08:22 +02:00
Lasse Collin 923b22483b xz: Add --block-size=SIZE.
This uses LZMA_FULL_FLUSH every SIZE bytes of input.

Man page wasn't updated yet.
2011-03-18 19:10:30 +02:00
Lasse Collin 57597d42ca xz: Add --single-stream.
This can be useful when there is garbage after the
compressed stream (.xz, .lzma, or raw stream).

Man page wasn't updated yet.
2011-03-18 18:19:19 +02:00
Lasse Collin 96f94bc925 xz: Clean up suffix.c.
struct suffix_pair isn't needed in compresed_name()
so get rid of it there.
2011-02-06 20:16:14 +02:00
Lasse Collin 8930c7ae3f xz: Check if the file already has custom suffix when compressing.
Now "xz -S .test foo.test" refuses to compress the
file because it already has the suffix .test. The man
page had it documented this way already.
2011-02-06 20:16:14 +02:00
Lasse Collin 6dd061adfd Merge commit '5fbce0b8d96dc96775aa0215e3581addc830e23d' 2011-02-06 20:13:01 +02:00
Lasse Collin 03ebd1bbb3 xz: Fix --force on setuid/setgid/sticky and multi-hardlink files.
xz didn't compress setuid/setgid/sticky files and files
with multiple hard links even with --force. This bug was
introduced in 23ac2c44c3.

Thanks to Charles Wilson.
2011-01-26 12:19:08 +02:00
Lasse Collin 9d542ceebc Merge branch 'v5.0' 2011-01-19 11:45:35 +02:00
Lasse Collin f71c4e16e9 Add alloc_size and malloc attributes to a few functions.
Thanks to Cristian Rodríguez for the original patch.
2011-01-18 21:23:50 +02:00
Lasse Collin 316cbe2446 Scripts: Fix gzip and bzip2 support in xzdiff. 2010-12-13 16:36:33 +02:00
Lasse Collin 4f2c69a4e3 Merge branch 'v5.0' 2010-12-12 23:13:22 +02:00
Lasse Collin ce56f63c41 Add missing PRIx32 and PRIx64 compatibility definitions.
This fixes portability to systems that lack C99 inttypes.h.

Thanks to Juan Manuel Guerrero.
2010-12-12 16:07:11 +02:00
Lasse Collin e6baedddcf DOS-like: Treat \ and : as directory separators in addition to /.
Juan Manuel Guerrero had fixed this in his XZ Utils port
to DOS/DJGPP. The bug affects also Windows and OS/2.
2010-12-12 14:50:04 +02:00
Lasse Collin 7c24e0d1b8 Merge branch 'v5.0' 2010-11-15 14:33:01 +02:00
Lasse Collin 3e564704bc liblzma: Document the return value of lzma_lzma_preset(). 2010-11-15 14:28:26 +02:00
Lasse Collin 974ebe6349 liblzma: Rename a few variables and constants.
This has no semantic changes. I find the new names slightly
more logical and they match the names that are already used
in XZ Embedded.

The name fastpos wasn't changed (not worth the hassle).
2010-10-26 10:36:41 +03:00
Lasse Collin 7c427ec38d Bump version 5.1.0alpha. 2010-10-25 12:59:25 +03:00
Lasse Collin b667a3ef63 Bump version to 5.0.0 and liblzma version-info to 5:0:0. 2010-10-23 14:02:53 +03:00
Lasse Collin 8c947e9291 liblzma: Make lzma_code() check the reserved members in lzma_stream.
If any of the reserved members in lzma_stream are non-zero
or non-NULL, LZMA_OPTIONS_ERROR is returned. It is possible
that a new feature in the future is indicated by just setting
a reserved member to some other value, so the old liblzma
version need to catch it as an unsupported feature.
2010-10-23 12:30:54 +03:00
Lasse Collin e61d85e082 Windows: Use MinGW's stdio functions.
The non-standard ones from msvcrt.dll appear to work
most of the time with XZ Utils, but there are some
corner cases where things may go very wrong. So it's
good to use the better replacements provided by
MinGW(-w64) runtime.
2010-10-23 12:26:33 +03:00
Lasse Collin 23e23f1dc0 liblzma: Use 512 as INDEX_GROUP_SIZE.
This lets compiler use shifting instead of 64-bit division.
2010-10-23 12:21:32 +03:00
Lasse Collin 613939fc82 liblzma: A few ABI tweaks to reserve space in structures. 2010-10-23 12:20:11 +03:00
Lasse Collin 68b83f252d xz: Make sure that message_strm() can never return NULL. 2010-10-21 23:16:11 +03:00
Lasse Collin d09c5753e3 liblzma: Update the comments in the API headers.
Adding support for LZMA_FINISH for Index encoding and
decoding needed tiny additions to the relevant .c files too.
2010-10-21 23:06:31 +03:00
Lasse Collin 0076e03641 Clean up a few FIXMEs and TODOs.
lzma_chunk_size() was commented out because it is
currently useless.
2010-10-19 11:44:37 +03:00
Lasse Collin f0fa880d24 xz: Avoid raise() also on OpenVMS.
This is similar to DOS/DJGPP that killing the program
with a signal will print a backtrace or a similar message.
2010-10-12 15:13:30 +03:00
Lasse Collin ac462b1c47 xz: Avoid SA_RESTART for portability reasons.
SA_RESTART is not as portable as I had hoped. It's missing
at least from OpenVMS, QNX, and DJGPP). Luckily we can do
fine without SA_RESTART.
2010-10-11 21:26:19 +03:00
Lasse Collin d52b411716 xz: Use "%"PRIu32 instead of "%d" in a format string. 2010-10-10 17:58:58 +03:00
Lasse Collin d492b80ddd lzmainfo: Use "%"PRIu32 instead of "%u" for uint32_t. 2010-10-10 16:49:01 +03:00
Lasse Collin 825e859a90 lzmainfo: Use fileno(stdin) instead of STDIN_FILENO. 2010-10-10 16:47:01 +03:00
Lasse Collin acbc4cdecb lzmainfo: Use setmode() on DOS-like systems. 2010-10-09 23:20:51 +03:00
Lasse Collin ef364d3abc OS/2 and DOS: Be less verbose on signals.
Calling raise() to kill xz when user has pressed C-c
is a bit verbose on OS/2 and DOS/DJGPP. Instead of
calling raise(), set only the exit status to 1.
2010-10-09 21:51:03 +03:00
Lasse Collin efeb998a2b lzmainfo: Add Windows resource file. 2010-10-09 13:02:15 +03:00
Lasse Collin 389d418445 Add missing public domain notice to lzmadec_w32res.rc. 2010-10-09 12:57:25 +03:00
Lasse Collin 6389c773a4 Windows: Update common_w32res.rc. 2010-10-09 12:52:12 +03:00
Lasse Collin d3cd7abe85 Use LZMA_VERSION_STRING instead of PACKAGE_VERSION.
Those are the same thing, and the former makes it a bit
easier to build the code with other build systems, because
one doesn't need to update the version number into custom
config.h.

This change affects only lzmainfo. Other tools were already
using LZMA_VERSION_STRING.
2010-10-08 16:53:20 +03:00