Major documentation update.

Installation and packaging instructions were added.
README and other generic docs were revised.

Some of the documentation files are now installed to $docdir.
Lasse Collin 2009-07-19 13:14:20 +03:00
parent ef4cf1851d
commit 99f9e879a6
8 changed files with 1071 additions and 171 deletions

31
AUTHORS

@ -2,17 +2,26 @@
Authors of XZ Utils
===================
Igor Pavlov
* designed LZMA as an algorithm;
* wrote an implementation known as LZMA SDK, which is part of
the bigger 7-Zip project.
XZ Utils is developed and maintained by Lasse Collin
<lasse.collin@tukaani.org>.
Ville Koskinen
* wrote the first version of the gzip-like lzma command line
utility (C++)
* helped a lot with the documentation.
Major parts of liblzma are based on code written by Igor Pavlov,
specifically the LZMA SDK <http://7-zip.org/sdk.html>. Without
this code, XZ Utils wouldn't exist.
Lasse Collin
* ported LZMA SDK to C and zlib-like API (liblzma);
* rewrote the command line tool again to use liblzma and pthreads.
The SHA-256 implementation in liblzma is based on the code found from
7-Zip <http://7-zip.org/>, which has a modified version of the SHA-256
code found from Crypto++ <http://www.cryptopp.com/>. The SHA-256 code
in Crypto++ was written by Kevin Springle and Wei Dai.
Some scripts have been adapted from gzip. The original versions
were written by Jean-loup Gailly, Charles Levert, and Paul Eggert.
Andrew Dudman helped adapt the scripts and their man pages for
XZ Utils.
The GNU Autotools based build system contains files from many authors,
which I'm not trying to list here.
Several people have contributed fixes or reported bugs. Most of them
are mentioned in the file THANKS.


@ -1,2 +1,7 @@
See the commit log in the git repository:
git://ctrl.tukaani.org/xz.git
Note that "make dist" doesn't put this tiny file into the package.
Instead, the git commit log is used as ChangeLog. See dist-hook in
Makefile.am for details.

327
INSTALL Normal file

@ -0,0 +1,327 @@
XZ Utils Installation
=====================
0. Preface
1. Supported platforms
1.1. Compilers
1.2. Platform-specific notes
1.2.1. Darwin (Mac OS X)
1.2.2. Tru64
1.2.3. Windows
1.2.4. DOS
1.2.5. OS/2
1.3. Adding support for new platforms
2. configure options
3. xzgrep and other scripts
3.1. Dependencies
3.2. PATH
4. Troubleshooting
4.1. "No C99 compiler was found."
4.2. "No POSIX conforming shell (sh) was found."
4.3. configure works but build fails at crc32_x86.S
0. Preface
----------
If you aren't familiar with building packages that use GNU Autotools,
see the file INSTALL.generic for generic instructions before reading
further.
If you are going to build a package for distribution, see also the
file PACKAGERS. It contains information that should help make the
binary packages as good as possible, but the information isn't very
interesting to those making local builds for private use or for use
in special situations like embedded systems.
1. Supported platforms
----------------------
XZ Utils are developed on GNU/Linux, but they should work on many
POSIX-like operating systems like *BSDs and Solaris, and even on
a few non-POSIX operating systems.
1.1. Compilers
A C99 compiler is required to compile XZ Utils. If you use GCC, you
need at least version 3.x.x. GCC version 2.xx.x doesn't support some
C99 features used in XZ Utils source code, thus GCC 2 won't compile
XZ Utils.
XZ Utils takes advantage of some GNU C extensions when building
with GCC. Because these extensions are used only when building
with GCC, it should be possible to use any C99 compiler.
1.2. Platform-specific notes
1.2.1. Darwin (Mac OS X)
You may need --disable-assembler if building universal binaries on
Darwin. This is because different files are built when assembler is
enabled, and there's no way to make it work with universal build.
If you want to keep the assembler code, consider building one
architecture at a time, and then combining the results to create
universal binaries (see lipo(1)).
1.2.2. Tru64
If you try to use the native C compiler on Tru64 (passing CC=cc to
configure), it is possible that the configure script will complain
that no C99 compiler was found even when the native compiler supports
C99. You can safely override the test for C99 compiler by passing
ac_cv_prog_cc_c99= as the argument to the configure script.
1.2.3. Windows
Building XZ Utils on Windows is supported under MinGW and Cygwin.
If the Autotools based build gives you trouble with MinGW, you may
want to try the alternative method found in the "windows" directory.
MSVC doesn't support C99, thus it is not possible to use MSVC to
compile XZ Utils. However, it is possible to use liblzma.dll from
MSVC once liblzma.dll has been built with MinGW. The required
import library for MSVC can be created from liblzma.def using the
"lib" command shipped in MSVC:
lib /def:liblzma.def /out:liblzma.lib /machine:ix86
On x86-64, the /machine argument has to naturally be changed:
lib /def:liblzma.def /out:liblzma.lib /machine:x64
1.2.4. DOS
There is an experimental Makefile in the "dos" directory to build
XZ Utils on DOS using DJGPP. Support for long file names (LFN) is
needed.
GNU Autotools based build hasn't been tried on DOS.
1.2.5. OS/2
You will need to pass --disable-assembler to configure when building
on OS/2.
1.3. Adding support for new platforms
If you have written patches to make XZ Utils work on a previously
unsupported platform, please send the patches to me! I will consider
including them in the official version. It's nice to minimize the
need for third-party patching.
One exception: Don't request or send patches to change the whole
source package to C89. I find C99 substantially nicer to write and
maintain. However, the public library headers must be in C89 to
avoid frustrating those who maintain programs that are strictly
in C89 or C++.
2. configure options
--------------------
In most cases, the defaults are what you want. Most of the options
below are useful only when building a size-optimized version of
liblzma or command line tools.
--enable-encoders=LIST
--disable-encoders
Specify a comma-separated LIST of filter encoders to
build. See "./configure --help" for the exact list of
available filter encoders. The default is to build all
supported encoders.
If LIST is empty or --disable-encoders is used, no filter
encoders will be built and also the code shared between
encoders will be omitted.
Disabling encoders will remove some symbols from the
liblzma ABI, so this option should be used only when it
is known to not cause problems.
--enable-decoders=LIST
--disable-decoders
This is like --enable-encoders but for decoders. The
default is to build all supported decoders.
--enable-match-finders=LIST
liblzma includes two categories of match finders:
hash chains and binary trees. Hash chains (hc3 and hc4)
are quite fast but they don't provide the best compression
ratio. Binary trees (bt2, bt3 and bt4) give excellent
compression ratio, but they are slower and need more
memory than hash chains.
You need to enable at least one match finder to build the
LZMA1 or LZMA2 filter encoders. Usually hash chains are
used only in the fast mode, while binary trees are used
when the best compression ratio is wanted.
The default is to build all the match finders if LZMA1
or LZMA2 filter encoders are being built.
--enable-checks=LIST
liblzma supports multiple integrity checks. CRC32 is
mandatory, and cannot be omitted. See "./configure --help"
for the exact list of available integrity check types.
liblzma and the command line tools can decompress files
which use unsupported integrity check type, but naturally
the file integrity cannot be verified in that case.
Disabling integrity checks may remove some symbols from
the liblzma ABI, so this option should be used only when
it is known to not cause problems.
--disable-assembler
liblzma includes some assembler optimizations. Currently
there is only assembler code for CRC32 and CRC64 for
32-bit x86.
All the assembler code in liblzma is position-independent
code, which is suitable for use in shared libraries and
position-independent executables. So far only i386
instructions are used, but the code is optimized for i686
class CPUs. If you are compiling liblzma exclusively for
pre-i686 systems, you may want to disable the assembler
code.
--enable-unaligned-access
Allow liblzma to use unaligned memory access for 16-bit
and 32-bit loads and stores. This should be enabled only
when the hardware supports this, i.e. when unaligned
access is fast. Some operating system kernels emulate
unaligned access, which is extremely slow. This option
shouldn't be used on systems that rely on such emulation.
Unaligned access is enabled by default on x86, x86-64,
and big endian PowerPC.
--enable-small
Reduce the size of liblzma by selecting smaller but
semantically equivalent versions of some functions, and
by omitting precomputed lookup tables. This option tends
to make liblzma slightly slower.
Note that while omitting the precomputed tables makes
liblzma smaller on disk, the tables are still needed at
run time, and need to be computed at startup. This also
means that the RAM holding the tables won't be shared
between applications linked against shared liblzma.
--disable-threads
Disable threading support. This makes some things
thread-unsafe, meaning that if a multithreaded application
calls liblzma functions from more than one thread,
something bad may happen.
Use this option if threading support causes you trouble,
or if you know that you will use liblzma only from
single-threaded applications and want to avoid dependency
on libpthread.
--enable-dynamic
Link the command line tools against shared liblzma. The
default (and recommended way) is to link the command line
tools against static liblzma.
This option is mostly useful for packagers, if distro
policy requires linking against shared libraries. See the
file PACKAGERS for more information about pros and cons
of this option.
--enable-debug
This enables the assert() macro and possibly some other
run-time consistency checks. It makes the code slower, so
you normally don't want to have this enabled.
--enable-werror
If building with GCC, make all compiler warnings errors
that abort the compilation. This may help catch bugs,
and should work on most systems. This has no effect on the
resulting binaries.
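
As a rough illustration of how the options above can be combined, the
following sketch assembles a configure command for a size-optimized,
decoder-only build. The filter names lzma1 and lzma2 and the check name
crc32 are assumptions here; "./configure --help" gives the authoritative
lists. The command is only printed, so it can be reviewed before it is
run from the top of the source tree.

```shell
# Assumed filter and check names; confirm them with "./configure --help".
decoders=lzma1,lzma2
checks=crc32

# Decoder-only, size-optimized, no libpthread dependency.
configure_cmd="./configure --disable-encoders --enable-decoders=$decoders --enable-checks=$checks --enable-small --disable-threads"

# Print the command for review instead of executing it.
echo "$configure_cmd"
```

Because disabling encoders or checks can remove symbols from the liblzma
ABI, a build like this is probably best kept to embedded or otherwise
self-contained systems.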
3. xzgrep and other scripts
---------------------------
3.1. Dependencies
A POSIX shell (sh) and a bunch of other standard POSIX tools are required
to run the scripts. The configure script tries to find a POSIX
compliant sh, but if it fails, you can force the shell by passing
gl_cv_posix_shell=/path/to/posix-sh as an argument to the configure
script.
Some of the scripts also require mktemp. The original mktemp can be
found at <http://www.mktemp.org/>. On GNU, most will use the mktemp
program from GNU coreutils instead of the original implementation.
Both mktemp versions are fine for XZ Utils (and practically for
everything else too).
3.2. PATH
The scripts assume that the required tools (standard POSIX utilities,
mktemp, and xz) are in PATH; the scripts don't set the PATH themselves.
Some people like this while some think this is a bug. Those in the
latter group can easily patch the scripts before running the configure
script by taking advantage of a placeholder line in the scripts.
For example, to make the scripts prefix /usr/bin:/bin to PATH:
perl -pi -e 's|^#SET_PATH.*$|PATH=/usr/bin:/bin:\$PATH|' \
src/scripts/xz*.in
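
To see what the substitution does before touching the real scripts, it
can be tried on a scratch file. This sketch uses sed in place of perl and
a made-up placeholder line; only the #SET_PATH prefix is taken from the
command above, the rest of the file is hypothetical.

```shell
# Scratch copy of a script template (contents are made up for the demo).
tmp=$(mktemp -d) || exit 1
printf '%s\n' \
    '#!/bin/sh' \
    '#SET_PATH - placeholder line (wording assumed)' \
    'echo hello' > "$tmp/xzdemo.in"

# Same substitution as the perl one-liner, written for sed.
# Single quotes keep $PATH literal so it is expanded when the script runs.
sed 's|^#SET_PATH.*$|PATH=/usr/bin:/bin:$PATH|' "$tmp/xzdemo.in" > "$tmp/xzdemo.patched"

# Show the line that replaced the placeholder.
patched_line=$(sed -n '2p' "$tmp/xzdemo.patched")
echo "$patched_line"

rm -rf "$tmp"
```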
4. Troubleshooting
------------------
4.1. "No C99 compiler was found."
You need a C99 compiler to build XZ Utils. If the configure script
cannot find a C99 compiler and you think you have such a compiler
installed, set the compiler command by passing CC=/path/to/c99 as
an argument to the configure script.
If you get this error even when you think your compiler supports C99,
you can override the test by passing ac_cv_prog_cc_c99= as an argument
to the configure script. The test for C99 compiler is not perfect (and
it is not as easy to make it perfect as it sounds), so sometimes this
may be needed. You will get a compile error if your compiler doesn't
support enough C99.
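
Before overriding the configure test, it may be worth checking by hand
whether the compiler accepts a few C99 constructs. The following is only
a rough sketch, not the test that configure runs: the test program and
the variable names are made up for illustration. A compiler that needs a
flag such as -std=c99 will be reported as "no" unless that flag is put
into CFLAGS.

```shell
# Hypothetical manual check; not the test used by the configure script.
CC=${CC:-cc}
tmpdir=$(mktemp -d) || exit 1

cat > "$tmpdir/c99test.c" <<'EOF'
#include <stdbool.h>
#include <stdint.h>
int main(void)
{
    // C99 features: line comments, designated initializers,
    // and declarations inside the for statement.
    struct { uint32_t a; bool b; } s = { .a = 1, .b = true };
    for (int i = 0; i < 2; ++i)
        s.a += (uint32_t)i;
    return s.b && s.a == 2 ? 0 : 1;
}
EOF

# $CC and $CFLAGS are left unquoted on purpose so that they may hold options.
if $CC $CFLAGS -c -o "$tmpdir/c99test.o" "$tmpdir/c99test.c" 2>/dev/null; then
    c99_ok=yes
else
    c99_ok=no
fi

rm -rf "$tmpdir"
echo "C99 test program compiled with $CC: $c99_ok"
```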
4.2. "No POSIX conforming shell (sh) was found."
xzgrep and other scripts need a shell that (roughly) conforms
to POSIX. The configure script tries to find such a shell. If
it fails, you can force the shell to be used by passing
gl_cv_posix_shell=/path/to/posix-sh as an argument to the configure
script.
4.3. configure works but build fails at crc32_x86.S
The easy fix is to pass --disable-assembler to the configure script.
The configure script determines if assembler code can be used by
looking at the configure triplet; there is currently no check if
the assembler code can actually be built. The x86 assembler
code should work on x86 GNU/Linux, *BSDs, Solaris, Darwin, MinGW,
Cygwin, and DJGPP. On other x86 systems, there may be problems and
the assembler code may need to be disabled with the configure option.
If you get this error when building for x86-64, either you have
specified the wrong architecture or the configure script has
misguessed it. Pass the
correct configure triplet using the --build=CPU-COMPANY-SYSTEM option
(see INSTALL.generic).

302
INSTALL.generic Normal file

@ -0,0 +1,302 @@
Installation Instructions
*************************
Copyright (C) 1994, 1995, 1996, 1999, 2000, 2001, 2002, 2004, 2005,
2006, 2007, 2008, 2009 Free Software Foundation, Inc.
This file is free documentation; the Free Software Foundation gives
unlimited permission to copy, distribute and modify it.
Basic Installation
==================
Briefly, the shell commands `./configure; make; make install' should
configure, build, and install this package. The following
more-detailed instructions are generic; see the `README' file for
instructions specific to this package.
The `configure' shell script attempts to guess correct values for
various system-dependent variables used during compilation. It uses
those values to create a `Makefile' in each directory of the package.
It may also create one or more `.h' files containing system-dependent
definitions. Finally, it creates a shell script `config.status' that
you can run in the future to recreate the current configuration, and a
file `config.log' containing compiler output (useful mainly for
debugging `configure').
It can also use an optional file (typically called `config.cache'
and enabled with `--cache-file=config.cache' or simply `-C') that saves
the results of its tests to speed up reconfiguring. Caching is
disabled by default to prevent problems with accidental use of stale
cache files.
If you need to do unusual things to compile the package, please try
to figure out how `configure' could check whether to do them, and mail
diffs or instructions to the address given in the `README' so they can
be considered for the next release. If you are using the cache, and at
some point `config.cache' contains results you don't want to keep, you
may remove or edit it.
The file `configure.ac' (or `configure.in') is used to create
`configure' by a program called `autoconf'. You need `configure.ac' if
you want to change it or regenerate `configure' using a newer version
of `autoconf'.
The simplest way to compile this package is:
1. `cd' to the directory containing the package's source code and type
`./configure' to configure the package for your system.
Running `configure' might take a while. While running, it prints
some messages telling which features it is checking for.
2. Type `make' to compile the package.
3. Optionally, type `make check' to run any self-tests that come with
the package.
4. Type `make install' to install the programs and any data files and
documentation.
5. You can remove the program binaries and object files from the
source code directory by typing `make clean'. To also remove the
files that `configure' created (so you can compile the package for
a different kind of computer), type `make distclean'. There is
also a `make maintainer-clean' target, but that is intended mainly
for the package's developers. If you use it, you may have to get
all sorts of other programs in order to regenerate files that came
with the distribution.
6. Often, you can also type `make uninstall' to remove the installed
files again.
Compilers and Options
=====================
Some systems require unusual options for compilation or linking that
the `configure' script does not know about. Run `./configure --help'
for details on some of the pertinent environment variables.
You can give `configure' initial values for configuration parameters
by setting variables in the command line or in the environment. Here
is an example:
./configure CC=c99 CFLAGS=-g LIBS=-lposix
*Note Defining Variables::, for more details.
Compiling For Multiple Architectures
====================================
You can compile the package for more than one kind of computer at the
same time, by placing the object files for each architecture in their
own directory. To do this, you can use GNU `make'. `cd' to the
directory where you want the object files and executables to go and run
the `configure' script. `configure' automatically checks for the
source code in the directory that `configure' is in and in `..'.
With a non-GNU `make', it is safer to compile the package for one
architecture at a time in the source code directory. After you have
installed the package for one architecture, use `make distclean' before
reconfiguring for another architecture.
On MacOS X 10.5 and later systems, you can create libraries and
executables that work on multiple system types--known as "fat" or
"universal" binaries--by specifying multiple `-arch' options to the
compiler but only a single `-arch' option to the preprocessor. Like
this:
./configure CC="gcc -arch i386 -arch x86_64 -arch ppc -arch ppc64" \
CXX="g++ -arch i386 -arch x86_64 -arch ppc -arch ppc64" \
CPP="gcc -E" CXXCPP="g++ -E"
This is not guaranteed to produce working output in all cases, you
may have to build one architecture at a time and combine the results
using the `lipo' tool if you have problems.
Installation Names
==================
By default, `make install' installs the package's commands under
`/usr/local/bin', include files under `/usr/local/include', etc. You
can specify an installation prefix other than `/usr/local' by giving
`configure' the option `--prefix=PREFIX'.
You can specify separate installation prefixes for
architecture-specific files and architecture-independent files. If you
pass the option `--exec-prefix=PREFIX' to `configure', the package uses
PREFIX as the prefix for installing programs and libraries.
Documentation and other data files still use the regular prefix.
In addition, if you use an unusual directory layout you can give
options like `--bindir=DIR' to specify different values for particular
kinds of files. Run `configure --help' for a list of the directories
you can set and what kinds of files go in them.
If the package supports it, you can cause programs to be installed
with an extra prefix or suffix on their names by giving `configure' the
option `--program-prefix=PREFIX' or `--program-suffix=SUFFIX'.
Optional Features
=================
Some packages pay attention to `--enable-FEATURE' options to
`configure', where FEATURE indicates an optional part of the package.
They may also pay attention to `--with-PACKAGE' options, where PACKAGE
is something like `gnu-as' or `x' (for the X Window System). The
`README' should mention any `--enable-' and `--with-' options that the
package recognizes.
For packages that use the X Window System, `configure' can usually
find the X include and library files automatically, but if it doesn't,
you can use the `configure' options `--x-includes=DIR' and
`--x-libraries=DIR' to specify their locations.
Particular systems
==================
On HP-UX, the default C compiler is not ANSI C compatible. If GNU
CC is not installed, it is recommended to use the following options in
order to use an ANSI C compiler:
./configure CC="cc -Ae -D_XOPEN_SOURCE=500"
and if that doesn't work, install pre-built binaries of GCC for HP-UX.
On OSF/1 a.k.a. Tru64, some versions of the default C compiler cannot
parse its `<wchar.h>' header file. The option `-nodtk' can be used as
a workaround. If GNU CC is not installed, it is therefore recommended
to try
./configure CC="cc"
and if that doesn't work, try
./configure CC="cc -nodtk"
On Solaris, don't put `/usr/ucb' early in your `PATH'. This
directory contains several dysfunctional programs; working variants of
these programs are available in `/usr/bin'. So, if you need `/usr/ucb'
in your `PATH', put it _after_ `/usr/bin'.
On Haiku, software installed for all users goes in `/boot/common',
not `/usr/local'. It is recommended to use the following options:
./configure --prefix=/boot/common
Specifying the System Type
==========================
There may be some features `configure' cannot figure out
automatically, but needs to determine by the type of machine the package
will run on. Usually, assuming the package is built to be run on the
_same_ architectures, `configure' can figure that out, but if it prints
a message saying it cannot guess the machine type, give it the
`--build=TYPE' option. TYPE can either be a short name for the system
type, such as `sun4', or a canonical name which has the form:
CPU-COMPANY-SYSTEM
where SYSTEM can have one of these forms:
OS
KERNEL-OS
See the file `config.sub' for the possible values of each field. If
`config.sub' isn't included in this package, then this package doesn't
need to know the machine type.
If you are _building_ compiler tools for cross-compiling, you should
use the option `--target=TYPE' to select the type of system they will
produce code for.
If you want to _use_ a cross compiler, that generates code for a
platform different from the build platform, you should specify the
"host" platform (i.e., that on which the generated programs will
eventually be run) with `--host=TYPE'.
Sharing Defaults
================
If you want to set default values for `configure' scripts to share,
you can create a site shell script called `config.site' that gives
default values for variables like `CC', `cache_file', and `prefix'.
`configure' looks for `PREFIX/share/config.site' if it exists, then
`PREFIX/etc/config.site' if it exists. Or, you can set the
`CONFIG_SITE' environment variable to the location of the site script.
A warning: not all `configure' scripts look for a site script.
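
A site script is ordinary shell code that `configure' reads early on.
The following sketch shows what a small `config.site' might look like.
The values are hypothetical, and the first two assignments only simulate
the placeholder defaults (`NONE' for the prefix, `/dev/null' for the
cache file) that `configure' is assumed to hold when it reads the file,
so that the fragment can be tried on its own.

```shell
# Simulate the state configure is assumed to be in when it reads the
# site script: options not given by the user hold placeholder defaults.
prefix=NONE
cache_file=/dev/null

# --- hypothetical config.site contents start here ---
# Install under /opt/local unless --prefix was given.
test "$prefix" = NONE && prefix=/opt/local

# Prefer gcc unless CC was already set by the user.
test -n "$CC" || CC=gcc

# Keep a cache file in the build directory unless one was requested.
test "$cache_file" = /dev/null && cache_file=./config.cache
# --- hypothetical config.site contents end here ---

echo "prefix=$prefix CC=$CC cache_file=$cache_file"
```

Testing each variable before assigning it keeps explicit command line
settings in charge, so the site script only supplies defaults.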
Defining Variables
==================
Variables not defined in a site shell script can be set in the
environment passed to `configure'. However, some packages may run
configure again during the build, and the customized values of these
variables may be lost. In order to avoid this problem, you should set
them in the `configure' command line, using `VAR=value'. For example:
./configure CC=/usr/local2/bin/gcc
causes the specified `gcc' to be used as the C compiler (unless it is
overridden in the site shell script).
Unfortunately, this technique does not work for `CONFIG_SHELL' due to
an Autoconf bug. Until the bug is fixed you can use this workaround:
CONFIG_SHELL=/bin/bash /bin/bash ./configure CONFIG_SHELL=/bin/bash
`configure' Invocation
======================
`configure' recognizes the following options to control how it
operates.
`--help'
`-h'
Print a summary of all of the options to `configure', and exit.
`--help=short'
`--help=recursive'
Print a summary of the options unique to this package's
`configure', and exit. The `short' variant lists options used
only in the top level, while the `recursive' variant lists options
also present in any nested packages.
`--version'
`-V'
Print the version of Autoconf used to generate the `configure'
script, and exit.
`--cache-file=FILE'
Enable the cache: use and save the results of the tests in FILE,
traditionally `config.cache'. FILE defaults to `/dev/null' to
disable caching.
`--config-cache'
`-C'
Alias for `--cache-file=config.cache'.
`--quiet'
`--silent'
`-q'
Do not print messages saying which checks are being made. To
suppress all normal output, redirect it to `/dev/null' (any error
messages will still be shown).
`--srcdir=DIR'
Look for the package's source code in directory DIR. Usually
`configure' can determine that directory automatically.
`--prefix=DIR'
Use DIR as the installation prefix. *Note Installation Names::
for more details, including other options available for fine-tuning
the installation locations.
`--no-create'
`-n'
Run the configure checks, but stop before creating any output
files.
`configure' also accepts some other, not widely useful, options. Run
`configure --help' for more details.


@ -14,6 +14,17 @@ endif
SUBDIRS += src po tests
doc_DATA = \
AUTHORS \
COPYING \
COPYING.GPLv2 \
NEWS \
README \
THANKS \
TODO \
doc/xz-file-format.txt \
doc/lzma-file-format.txt
EXTRA_DIST = \
version.sh \
Doxyfile.in \

278
PACKAGERS Normal file

@ -0,0 +1,278 @@
Information to packagers of XZ Utils
====================================
0. Preface
1. Package naming
2. Package description
3. License
4. configure options
4.1. Static vs. dynamic linking of liblzma
4.2. Optimizing xzdec and lzmadec
5. Additional documentation
6. Extra files
7. Installing XZ Utils and LZMA Utils in parallel
8. Example
0. Preface
----------
This document is meant for people who create and maintain XZ Utils
packages for operating system distributions. The focus is on GNU/Linux
systems, but most things apply to other systems too.
While the standard "configure && make DESTDIR=$PKG install" should
give a pretty good package, there are some details which packagers
may want to tweak.
Packagers should also read the INSTALL file.
1. Package naming
-----------------
The preferred name for the XZ Utils package is "xz", because that's
the name of the upstream tarball. Naturally you may have good reasons
to use some other name; I won't get angry about it. ;-) It's just nice
to be able to point people to the correct package name without asking
what distro they have.
If your distro policy is to split things into small pieces, here is
one suggestion:
xz xz, xzdec, scripts (xzdiff, xzgrep, etc.), docs
xz-lzma lzma, unlzma, lzcat, lzgrep etc. symlinks and
lzmadec binary for compatibility with LZMA Utils
liblzma liblzma.so.*
liblzma-devel liblzma.so, liblzma.a, API headers
2. Package description
----------------------
Here is a suggestion which you may use as the package description.
If you can use only a one-line description, pick only the first line.
Naturally, feel free to use some other description if you find it
better, and maybe send it to me too.
Library and command line tools for XZ and LZMA compressed files
XZ Utils provide a general purpose data compression library
and command line tools. The native file format is the .xz
format, but also the legacy .lzma format is supported. The .xz
format supports multiple compression algorithms, of which LZMA2
is currently the primary algorithm. With typical files, XZ Utils
create about 30 % smaller files than gzip.
If you are splitting XZ Utils into multiple packages, here are some
suggestions for package descriptions:
xz:
Command line tools for XZ and LZMA compressed files
This package includes the xz compression tool and other command
line tools from XZ Utils. xz has command line syntax similar to
that of gzip. The native file format is the .xz format, but also
the legacy .lzma format is supported. The .xz format supports
multiple compression algorithms, of which LZMA2 is currently the
primary algorithm. With typical files, XZ Utils create about 30 %
smaller files than gzip.
Note that this package doesn't include the files needed for
LZMA Utils 4.32.x compatibility. Install also the xz-lzma
package to make XZ Utils emulate LZMA Utils 4.32.x.
xz-lzma:
LZMA Utils emulation with XZ Utils
This package includes executables and symlinks to make
XZ Utils emulate lzma, unlzma, lzcat, and other command
line tools found from the legacy LZMA Utils 4.32.x package.
liblzma:
Library for XZ and LZMA compressed files
liblzma is a general purpose data compression library with
an API similar to that of zlib. liblzma supports multiple
algorithms, of which LZMA2 is currently the primary algorithm.
The native file format is .xz, but also the legacy .lzma
format and raw streams (no headers at all) are supported.
This package includes the shared library.
liblzma-devel:
Library for XZ and LZMA compressed files
This package includes the API headers, static library, and
other development files related to liblzma.
3. License
----------
If the package manager supports a license field, you probably should
put GPLv2+ there (GNU GPL v2 or later). The interesting parts of
XZ Utils are in the public domain, but some less important files
ending up in the binary package are under GPLv2+. So it is simplest
to just say GPLv2+ if you cannot specify "public domain and GPLv2+".
If you split XZ Utils into multiple packages as described earlier
in this file, the liblzma and liblzma-devel packages will contain only
public domain code (from XZ Utils at least; compiler or linker may
add some third-party code, which may be copyrighted).
4. configure options
--------------------
Unless you are building a package for a distribution that is meant
only for embedded systems, don't use the following configure options:
--enable-debug
--enable-encoders (*)
--enable-decoders
--enable-match-finders
--enable-checks
--enable-small (*)
--disable-threads (*)
(*) These are OK when building xzdec and lzmadec as explained later.
You may use --enable-werror but be careful with it since it may break
the build due to some useless warning when the build environment
changes (like CPU architecture or compiler version).
4.1. Static vs. dynamic linking of liblzma
The default is to link the command line tools against static liblzma.
This can be changed by passing --enable-dynamic to configure, or by
not building static libraries at all by passing --disable-static to
configure. It is mildly recommended that you use the default and link
the command line tools against static liblzma, but the configure
options make it easy to do otherwise if the distro policy so requires.
On 32-bit x86, linking against static liblzma can give a minor
speed improvement. Static libraries on x86 are usually compiled as
position-dependent code (non-PIC) and shared libraries are built as
position-independent code (PIC). PIC wastes one register, which can
make the code slightly slower compared to a non-PIC version. (Note
that this doesn't apply to x86-64.)
Linking against static liblzma avoids a dependency on liblzma shared
library, and makes it slightly easier to copy the command line tools
between systems (e.g. quick 'n' dirty emergency recovery of some
files). It also allows putting the command line tools in /bin while
leaving liblzma in /usr/lib (assuming that your distribution uses
such a file system hierarchy), if no other file in /bin would require
liblzma.
If you don't want to distribute static libraries but you still
want to link the command line tools against static liblzma, it is
probably easiest to build both static and shared liblzma, but after
"make DESTDIR=$PKG install" remove liblzma.a and modify liblzma.la
to not contain a reference to liblzma.a.
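
The cleanup described above could look roughly like this. The sketch runs
against a mock staging tree so that it is safe to try; the /usr/lib
location, the two-line mock liblzma.la, and the assumption that the
reference to the static library lives in libtool's old_library variable
are illustrative and should be checked against a real install.

```shell
# Mock staging tree standing in for the result of "make DESTDIR=$PKG install".
PKG=$(mktemp -d) || exit 1
libdir=$PKG/usr/lib
mkdir -p "$libdir"
: > "$libdir/liblzma.a"
printf '%s\n' "dlname='liblzma.so.0'" "old_library='liblzma.a'" > "$libdir/liblzma.la"

# Drop the static library from the package.
rm -f "$libdir/liblzma.a"

# Blank out the reference to it in the libtool archive file.
sed "s|^old_library=.*|old_library=''|" "$libdir/liblzma.la" > "$libdir/liblzma.la.new" \
    && mv "$libdir/liblzma.la.new" "$libdir/liblzma.la"

# Report the result, then remove the mock tree.
old_library_line=$(grep '^old_library=' "$libdir/liblzma.la")
if [ -e "$libdir/liblzma.a" ]; then static_present=yes; else static_present=no; fi
echo "$old_library_line (static library present: $static_present)"
rm -rf "$PKG"
```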
4.2. Optimizing xzdec and lzmadec
xzdec and lzmadec are intended to be relatively small rather than
optimized for the best speed. Thus, it is a good idea to build
xzdec and lzmadec separately:
- Only decoder code is needed, so you can speed up the build
slightly by passing --disable-encoders to configure. This
shouldn't affect the final size of the executables though,
because the linker is able to omit the encoder code anyway.
- xzdec and lzmadec will never use the multithreading capabilities of
liblzma. You can avoid a dependency on libpthread by passing
--disable-threads to configure.
- There are and will be no translated messages for xzdec and
lzmadec, so it is fine to also pass --disable-nls to configure.
- To select a somewhat size-optimized variant of some things in
liblzma, pass --enable-small to configure.
- Tell the compiler to optimize for size instead of speed.
E.g. with GCC, put -Os into CFLAGS.
5. Additional documentation
---------------------------
"make install" copies some additional documentation to $docdir
(--docdir in configure). These include a copy of the GNU GPL v2, which can
be replaced with a symlink if your distro ships with shared copies
of the common license texts.
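A minimal sketch of the symlink replacement follows. Both paths are assumptions: the installed file is taken to be named COPYING.GPLv2 under $docdir, and /usr/share/common-licenses/GPL-2 is only one example of where a distro might keep its shared copy.

```shell
#!/bin/sh
# Sketch: replace the installed GPLv2 text with a symlink to a shared copy.
# Both paths are assumptions; adjust them to your distro's layout.
set -eu

PKG=$(mktemp -d)
docdir="$PKG/usr/share/doc/xz"
shared_gpl2=/usr/share/common-licenses/GPL-2   # hypothetical location

# Stand-in for the copy that "make install" put into $docdir.
mkdir -p "$docdir"
echo "license text placeholder" > "$docdir/COPYING.GPLv2"

# -f replaces the regular file, -s makes the link symbolic. The target
# does not have to exist on the build host, only on the installed system.
ln -sf "$shared_gpl2" "$docdir/COPYING.GPLv2"

ls -l "$docdir/COPYING.GPLv2"
```

The link target is absolute on purpose, so that it resolves on the installed system rather than inside the staging directory.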
6. Extra files
--------------
The "extra" directory contains some small extra tools or other files.
The exact set of extra files can vary between XZ Utils releases. The
extra files have only limited use or they are too dangerous to be
put directly into $bindir (7z2lzma.sh is a good example, since it can
silently create corrupt output if certain conditions are not met).
If you feel like it, you may copy the extra directory under the doc
directory (e.g. /usr/share/doc/xz/extra). Maybe some people will find
them useful. However, most people needing these tools are probably
able to find them in the source package too.
The "debug" directory contains some tools that are useful only when
hacking on XZ Utils. Don't package these tools.
7. Installing XZ Utils and LZMA Utils in parallel
-------------------------------------------------
XZ Utils and LZMA Utils 4.32.x can be installed in parallel by
omitting the compatibility symlinks (lzma, unlzma, lzcat, lzgrep etc.)
from the XZ Utils package. It's probably a good idea to still package
the symlinks into a separate package so that users may choose if they
want to use XZ Utils or LZMA Utils for handling .lzma files.
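One way to split the symlinks into their own package is to move them from the main staging tree into a second one. The sketch below assumes two hypothetical staging directories (PKG and COMPAT) and uses only the four names listed above; the real set of compatibility links is longer ("etc.") and has to be taken from the actual install.

```shell
#!/bin/sh
# Sketch: move the LZMA Utils compatibility symlinks out of the main
# package tree into a separate one. PKG, COMPAT, and the name list are
# assumptions for illustration.
set -eu

PKG=$(mktemp -d)
COMPAT=$(mktemp -d)
mkdir -p "$PKG/usr/bin" "$COMPAT/usr/bin"

# Stand-ins for what "make DESTDIR=$PKG install" would have created.
: > "$PKG/usr/bin/xz"
: > "$PKG/usr/bin/xzgrep"
ln -s xz "$PKG/usr/bin/lzma"
ln -s xz "$PKG/usr/bin/unlzma"
ln -s xz "$PKG/usr/bin/lzcat"
ln -s xzgrep "$PKG/usr/bin/lzgrep"

for name in lzma unlzma lzcat lzgrep; do
    link="$PKG/usr/bin/$name"
    # Only move real symlinks, so a regular file is never taken by mistake.
    if [ -L "$link" ]; then
        mv "$link" "$COMPAT/usr/bin/"
    fi
done

ls -l "$COMPAT/usr/bin"
```

The moved links are relative, so they dangle inside the COMPAT tree and resolve only once both packages are installed into the same directory. The compatibility package should therefore depend on the main XZ Utils package.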
8. Example
----------
Here is an example for i686 GNU/Linux that
- links xz against static liblzma;
- includes only shared liblzma in the final package;
- links xzdec and lzmadec against static liblzma while
avoiding libpthread dependency.
PKG=/tmp/xz-pkg
tar xf xz-x.y.z.tar.gz
cd xz-x.y.z
./configure \
--prefix=/usr \
--sysconfdir=/etc \
CFLAGS='-march=i686 -O2'
make
make DESTDIR=$PKG install-strip
rm -f $PKG/usr/lib/lib*.a
sed -i "s/^old_library=.*$/old_library=''/" $PKG/usr/lib/lib*.la
make clean
./configure \
--prefix=/usr \
--sysconfdir=/etc \
--disable-shared \
--disable-nls \
--disable-encoders \
--enable-small \
--disable-threads \
CFLAGS='-march=i686 -Os'
make -C src/liblzma
make -C src/xzdec
make -C src/xzdec DESTDIR=$PKG install-strip
cp -a extra $PKG/usr/share/doc/xz

README

@ -2,89 +2,121 @@
XZ Utils
========
Important
This is a beta version. The .xz file format is now stable though,
which means that files created with the beta version will be
decompressible with all future XZ Utils versions too (assuming
that there are no catastrophic bugs).
liblzma API is pretty stable now, although minor tweaks may still
be done if really needed. The ABI is not stable yet. The major
soname will be bumped right before the first stable release.
Probably it will be bumped to something like .so.5.0.0 because
some distributions using the alpha versions already had to use
other versions than .so.0.0.0.
Excluding the Doxygen style docs in liblzma API headers, the
documentation in this package (including the rest of this
README) is not very up to date, and may contain incorrect or
misleading information.
0. Overview
1. Documentation
1.1. Overall documentation
1.2. Documentation for command line tools
1.3. Documentation for liblzma
2. Version numbering
3. Other implementations of the .xz format
4. Contact information
Overview
0. Overview
-----------
LZMA is a general purpose compression algorithm designed by
Igor Pavlov as part of 7-Zip. It provides high compression ratio
while keeping the decompression speed fast.
XZ Utils provide a general-purpose data compression library and
command line tools. The native file format is the .xz format, but
also the legacy .lzma format is supported. The .xz format supports
multiple compression algorithms, which are called "filters" in
context of XZ Utils. The primary filter is currently LZMA2. With
typical files, XZ Utils create about 30 % smaller files than gzip.
XZ Utils are an attempt to make LZMA compression easy to use
on free (as in freedom) operating systems. This is achieved by
providing tools and libraries which are used similarly to the
equivalents of the most popular existing compression algorithms.
To ease adapting support for the .xz format into existing applications
and scripts, the API of liblzma is somewhat similar to the API of the
popular zlib library. For the same reason, the command line tool xz
has command line syntax similar to that of gzip.
XZ Utils consist of a few relatively separate parts:
* liblzma is an encoder/decoder library with support for several
filters (algorithm implementations). The primary filter is LZMA.
* libzfile (or whatever the name will be) enables reading from and
writing to gzip, bzip2 and LZMA compressed and uncompressed files
with an API similar to the standard ANSI-C file I/O.
[ NOTE: libzfile is not implemented yet. ]
* xz command line tool has syntax almost identical to that of gzip
and bzip2. It makes LZMA easy for average users, but also
provides advanced options to finetune the compression settings.
* A few shell scripts make diffing and grepping LZMA compressed
files easy. The scripts were adapted from gzip and bzip2.
When aiming for the highest compression ratio, LZMA2 encoder uses
a lot of CPU time and may use, depending on the settings, even
hundreds of megabytes of RAM. However, in fast modes, LZMA2 encoder
competes with bzip2 in compression speed, RAM usage, and compression
ratio.
LZMA2 is reasonably fast to decompress. It is a little slower than
gzip, but a lot faster than bzip2. Being fast to decompress means
that the .xz format is especially nice when the same file will be
decompressed very many times (usually on different computers), which
is the case e.g. when distributing software packages. In such
situations, it's not too bad if the compression takes some time,
since that needs to be done only once to benefit many people.
With some file types, combining (or "chaining") LZMA2 with an
additional filter can improve compression ratio. A filter chain may
contain up to four filters, although usually only one or two are used.
For example, putting a BCJ (Branch/Call/Jump) filter before LZMA2
in the filter chain can improve compression ratio of executable files.
Since the .xz format allows adding new filter IDs, it is possible that
some day there will be a filter that is, for example, much faster to
compress than LZMA2 (but probably with worse compression ratio).
Similarly, it is possible that some day there is a filter that will
compress better than LZMA2.
XZ Utils don't support multithreaded compression or decompression
yet. It has been planned though and taken into account when designing
the .xz file format.
Supported platforms
1. Documentation
----------------
XZ Utils are developed on GNU+Linux, but they should work at
least on *BSDs and Solaris. They probably work on some other
POSIX-like operating systems too.
1.1. Overall documentation
If you use GCC to compile XZ Utils, you need at least version
3.x.x. GCC version 2.xx.x doesn't support some C99 features used
in XZ Utils source code, thus GCC 2 won't compile XZ Utils.
README This file
If you have written patches to make XZ Utils work on a previously
unsupported platform, please send the patches to me! I will consider
including them in the official version. It's nice to minimize the
need for third-party patching.
INSTALL.generic Generic install instructions for those not familiar
with packages using GNU Autotools
INSTALL Installation instructions specific to XZ Utils
PACKAGERS Information to packagers of XZ Utils
One exception: Don't request or send patches to change the whole
source package to C89. I find C99 substantially nicer to write and
maintain. However, the public library headers must be in C89 to
avoid frustrating those who maintain programs, which are strictly
in C89 or C++.
COPYING XZ Utils copyright and license information
COPYING.GPLv2 GNU General Public License version 2
COPYING.GPLv3 GNU General Public License version 3
COPYING.LGPLv2.1 GNU Lesser General Public License version 2.1
AUTHORS The main authors of XZ Utils
THANKS Incomplete list of people who have helped make
this software
NEWS User-visible changes between XZ Utils releases
ChangeLog Detailed list of changes (commit log)
Note that only some of the above files are included in binary
packages.
Platform-specific notes
1.2. Documentation for command line tools
On some Tru64 systems using the native C99 compiler, the configure
script may reject the compiler as non-C99 compiler. This may happen
if there is no stdbool.h available. You can still compile XZ Utils
on such a system by passing ac_cv_prog_cc_c99= to configure script.
Fixing this bug seems to be non-trivial since if the configure
doesn't check for stdbool.h, it runs into problems at least on
Solaris.
The command line tools are documented as man pages. In source code
releases (and possibly also in some binary packages), the man pages
are also provided in plain text (ASCII only) and PDF formats in the
directory "doc/man" to make the man pages more accessible to those
whose operating system doesn't provide an easy way to view man pages.
Version numbering
1.3. Documentation for liblzma
The version number of XZ Utils has absolutely nothing to do with
the version number of LZMA SDK or 7-Zip. The new version number
format of XZ Utils is X.Y.ZS:
The liblzma API headers include short docs about each function
and data type as Doxygen tags. These docs should be quite OK as
a quick reference.
I have planned to write a bunch of very well documented example
programs, which (due to comments) should work as a tutorial to
various features of liblzma. No such example programs have been
written yet.
For now, if you have never used liblzma, libbzip2, or zlib, I
recommend learning *basics* of zlib API. Once you know that, it
should be easier to learn liblzma.
http://zlib.net/manual.html
http://zlib.net/zlib_how.html
2. Version numbering
--------------------
The version number format of XZ Utils is X.Y.ZS:
- X is the major version. When this is incremented, the library
API and ABI break.
@ -109,97 +141,32 @@ Version numbering
the next version can be X.Y.(Z+1)beta but not X.Y.Zbeta.
configure options
3. Other implementations of the .xz format
------------------------------------------
If you are not familiar with `configure' scripts, read the file
INSTALL first.
7-Zip and the p7zip port of 7-Zip support the .xz format starting
from the version 9.00alpha.
In most cases, the default --enable/--disable/--with/--without options
are what you want. Don't touch them if you are unsure.
http://7-zip.org/
http://p7zip.sourceforge.net/
--disable-encoder
Do not compile the encoder component of liblzma. This
implies --disable-match-finders. If you need only
the decoder, you can decrease the library size
dramatically with this option.
XZ Embedded is a limited implementation written for use in the Linux
kernel, but it is also suitable for other embedded use.
The default is to build the encoder.
--disable-decoder
Do not compile the decoder component of liblzma.
The default is to build the decoder.
--enable-filters=
liblzma supports several filters. See liblzma-intro.txt
for a little more information about these.
The default is to build all the filters.
--enable-match-finders=
liblzma includes two categories of match finders:
hash chains and binary trees. Hash chains (hc3 and hc4)
are quite fast but they don't provide the best compression
ratio. Binary trees (bt2, bt3 and bt4) give excellent
compression ratio, but they are slower and need more
memory than hash chains.
You need to enable at least one match finder to build the
LZMA filter encoder. Usually hash chains are used only in
the fast mode, while binary trees are used when the best
compression ratio is wanted.
The default is to build all the match finders.
--enable-checks=
liblzma supports multiple integrity checks. CRC32 is
mandatory, and cannot be omitted. See liblzma-intro.txt
for more information about usage of the integrity checks.
--disable-assembler
liblzma includes some assembler optimizations. Currently
there is only assembler code for CRC32 and CRC64 for
32-bit x86.
All the assembler code in liblzma is position-independent
code, which is suitable for use in shared libraries and
position-independent executables. So far only i386
instructions are used, but the code is optimized for i686
class CPUs. If you are compiling liblzma exclusively for
pre-i686 systems, you may want to disable the assembler
code.
--enable-small
Omits precomputed tables. This makes liblzma a few KiB
smaller. Startup time increases, because the tables need
to be computed first.
--enable-debug
This enables the assert() macro and possibly some other
run-time consistency checks. It slows down things somewhat,
so you normally don't want to have this enabled.
--enable-werror
Makes all compiler warnings errors that abort the
compilation. This may help catching bugs, and should work
on most systems. This has no effect on the resulting
binaries.
http://tukaani.org/xz-embedded/
Static vs. dynamic linking of the command line tools
4. Contact information
----------------------
By default, the command line tools are linked statically against
liblzma. There are a few reasons:
If you have questions, bug reports, patches etc. related to XZ Utils,
contact Lasse Collin <lasse.collin@tukaani.org>. tukaani.org uses
greylisting to reduce spam, thus when you send your first email, it
may get delayed by a few hours. In addition to that, I'm sometimes
slow at replying. If you haven't got a reply within two weeks, assume
that your email has got lost and resend it or use IRC.
- The executable(s) can be in /bin while the shared liblzma can still
be in /usr/lib (if the distro uses such file system hierarchy).
- It's easier to copy the executables to other systems, since they
depend only on libc.
- It's slightly faster on some architectures like x86.
If you don't like this, you can get the command line tools linked
against the shared liblzma by specifying --disable-static to configure.
This disables building static liblzma completely.
You can also find me on #tukaani on Freenode; my nick is Larhzu.
The channel tends to be pretty quiet, so just ask your question and
someone may wake up.

THANKS

@ -1,11 +1,11 @@
Thanks
------
======
Some people have helped more, some less, some don't even know they have
been helpful, but nevertheless everyone's help has been important. :-)
In alphabetical order:
Some people have helped more, some less, but nevertheless everyone's help
has been important. :-) In alphabetical order:
- Mark Adler
- H. Peter Anvin
- Nelson H. F. Beebe
- Anders F. Björklund
- Emmanuel Blot
@ -13,7 +13,6 @@ In alphabetical order:
- Andrew Dudman
- İsmail Dönmez
- Mike Frysinger
- Jean-loup Gailly
- Per Øyvind Karlsen
- Ville Koskinen
- Stephan Kulow
@ -26,7 +25,6 @@ In alphabetical order:
- Bernhard Reutner-Fischer
- Alexandre Sauvé
- Andreas Schwab
- Julian Seward
- Dan Shechter
- Paul Townsend
- Mohammed Adnène Trojette
@ -34,8 +32,11 @@ In alphabetical order:
- Bert Wesarg
- Ralf Wildenhues
- Charles Wilson
- Lars Wirzenius
- Andreas Zieringer
Also thanks to all the people who have participated the Tukaani project
and others who I have forgot.
Also thanks to all the people who have participated in the Tukaani project.
I have probably forgotten to add some names to the above list. Sorry about
that and thanks for your help.