mirror of https://git.tukaani.org/xz.git (synced 2025-10-31 13:32:56 +00:00)
	Updated faq.txt.
Some questions worth answering were removed, because I currently don't have good up to date answers to them.
parent fe111a25cd
commit b198e770a1
239 doc/faq.txt
									
							| @ -2,185 +2,96 @@ | |||||||
| XZ Utils FAQ | XZ Utils FAQ | ||||||
| ============ | ============ | ||||||
| 
 | 
 | ||||||
| Q:  What are LZMA, LZMA Utils, lzma, .lzma, liblzma, LZMA SDK, LZMA_Alone, | Q:  What do the letters XZ mean? | ||||||
|     7-Zip and p7zip? |  | ||||||
| 
 | 
 | ||||||
| A:  LZMA stands for Lempel-Ziv-Markov chain-Algorithm. LZMA is the name | A:  Nothing. They are just two letters, which come from the file format | ||||||
|     of the compression algorithm designed by Igor Pavlov. He is the author |     suffix .xz. The .xz suffix was selected, because it seemed to be | ||||||
|     of 7-Zip, which is a great LGPL'd compression tool for Microsoft |     pretty much unused. There is no deeper meaning. | ||||||
|     Windows operating systems. In addition to 7-Zip itself, also LZMA SDK |  | ||||||
|     is available on the website of 7-Zip. LZMA SDK contains LZMA |  | ||||||
|     implementations in C++, Java and C#. The C++ version is the original |  | ||||||
|     implementation which is used also in 7-Zip itself. |  | ||||||
| 
 |  | ||||||
|     Excluding the unrar plugin, 7-Zip is free software (free as in |  | ||||||
|     freedom). Thanks to this, it was possible to port it to POSIX |  | ||||||
|     platforms. The port was done and is maintained by myspace (TODO: |  | ||||||
|     myspace's real name?). p7zip is a port of 7-Zip's command line version; |  | ||||||
|     p7zip doesn't include the 7-Zip's GUI. |  | ||||||
| 
 |  | ||||||
|     In POSIX world, users are used to gzip and bzip2 command line tools. |  | ||||||
|     Developers know APIs of zlib and libbzip2. LZMA Utils try to ease |  | ||||||
|     adoption of LZMA on free operating systems by providing a compression |  | ||||||
|     library and a set of command line tools. The library is called liblzma. |  | ||||||
|     It provides a zlib-like API making it easy to adapt LZMA compression in |  | ||||||
|     existing applications. The main command line tool is known as lzma, |  | ||||||
|     whose command line syntax is very similar to that of gzip and bzip2. |  | ||||||
| 
 |  | ||||||
|     The original command line tool from LZMA SDK (lzma.exe) was found from |  | ||||||
|     a directory called LZMA_Alone in the LZMA SDK. It used a simple header |  | ||||||
|     format in .lzma files. This format was also used by LZMA Utils up to |  | ||||||
|     and including 4.32.x. In LZMA Utils documentation, LZMA_Alone refers |  | ||||||
|     to both the file format and the command line tool from LZMA SDK. |  | ||||||
| 
 |  | ||||||
|     Because of various limitations of the LZMA_Alone file format, a new |  | ||||||
|     file format was developed. Extending some existing format such as .gz |  | ||||||
|     used by gzip was considered, but these formats were found to be too |  | ||||||
|     limited. The filename suffix for the new .lzma format is `.lzma'. The |  | ||||||
|     same suffix is also used for files in the LZMA_Alone format. To make |  | ||||||
|     the transition to the new format as transparent as possible, LZMA Utils |  | ||||||
|     support both the new and old formats transparently. |  | ||||||
| 
 |  | ||||||
|     7-Zip and LZMA SDK: <http://7-zip.org/> |  | ||||||
|     p7zip: <http://p7zip.sourceforge.net/> |  | ||||||
|     LZMA Utils: <http://tukaani.org/lzma/> |  | ||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| Q:  What LZMA implementations there are available? | Q:  What are LZMA and LZMA2? | ||||||
| 
 | 
 | ||||||
| A:  LZMA SDK contains implementations in C++, Java and C#. The C++ version | A:  LZMA stands for Lempel-Ziv-Markov chain-Algorithm. It is the name | ||||||
|     is the original implementation which is part of 7-Zip. LZMA SDK |     of the compression algorithm designed by Igor Pavlov for 7-Zip. | ||||||
|     contains also a small LZMA decoder in C. |     LZMA is based on LZ77 and range encoding. | ||||||
| 
 | 
 | ||||||
|     A port of LZMA SDK to Pascal was made by Alan Birtles |     LZMA2 is an updated version of the original LZMA to fix a couple of | ||||||
|     <http://www.birtles.org.uk/programming/>. It should work with |     practical issues. In context of XZ Utils, LZMA is called LZMA1 to | ||||||
|     multiple Pascal programming language implementations. |     emphasize that LZMA is not the same thing as LZMA2. LZMA2 is the | ||||||
| 
 |     primary compression algorithm in the .xz file format. | ||||||
|     LZMA Utils includes liblzma, which is directly based on LZMA SDK. |  | ||||||
|     liblzma is written in C (C99, not C89). In contrast to C++ callback |  | ||||||
|     API used by LZMA SDK, liblzma uses zlib-like stateful C API. I do not |  | ||||||
|     want to comment whether both/former/latter/neither API(s) are good or |  | ||||||
|     bad. The only reason to implement a zlib-like API was, that many |  | ||||||
|     developers are already familiar with zlib, and very many applications |  | ||||||
|     already use zlib. Having a similar API makes it easier to include LZMA |  | ||||||
|     support in existing applications. |  | ||||||
| 
 |  | ||||||
|     See also <http://en.wikipedia.org/wiki/LZMA#External_links>. |  | ||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| Q:  Which file formats are supported by LZMA Utils? | Q:  There are many LZMA related projects. How does XZ Utils relate to them? | ||||||
| 
 | 
 | ||||||
| A:  Even when the raw LZMA stream is always the same, it can be wrapped | A:  7-Zip and LZMA SDK are the original projects. LZMA SDK is roughly | ||||||
|     in different container formats. The preferred format is the new .lzma |     a subset of the 7-Zip source tree. | ||||||
|     format. It has magic bytes (the first six bytes: 0xFF 'L' 'Z' 'M' |  | ||||||
|     'A' 0x00). The format supports chaining up to seven filters, splitting |  | ||||||
|     data to multiple blocks for easier multi-threading and rough |  | ||||||
|     random-access reading. The file integrity is verified using CRC32, |  | ||||||
|     CRC64, or SHA256, and by verifying the uncompressed size of the file. |  | ||||||
| 
 | 
 | ||||||
|     LZMA SDK includes a tool called LZMA_Alone. It uses a |     p7zip is 7-Zip's command line tools ported to POSIX-like systems. | ||||||
|     primitive header which includes only the mandatory stream information |  | ||||||
|     required by the LZMA decoder. This format can be both read and |  | ||||||
|     written by liblzma and the command line tool (use --format=alone to |  | ||||||
|     create such files). |  | ||||||
| 
 | 
 | ||||||
|     .7z is the native archive format used by 7-Zip. This format is not |     LZMA Utils provide a gzip-like lzma tool for POSIX-like systems. | ||||||
|     supported by liblzma, and probably will never be supported. You |     LZMA Utils are based on LZMA SDK. XZ Utils are the successor to | ||||||
|     should use e.g. p7zip to extract .7z files. |     LZMA Utils. | ||||||
| 
 | 
 | ||||||
|     It is possible to implement custom file formats by using raw filter |     There are several other projects using LZMA. Most are more or less | ||||||
|     mode in liblzma. In this mode the application needs to store the filter |     based on LZMA SDK. | ||||||
|     properties and provide them to liblzma before starting to uncompress |  | ||||||
|     the data. |  | ||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| Q:  How can I identify files containing LZMA compressed data? | Q:  Do XZ Utils support the .7z format? | ||||||
| 
 | 
 | ||||||
| A:  The preferred filename suffix for .lzma files is `.lzma'. `.tar.lzma' | A:  No. Use 7-Zip (Windows) or p7zip (POSIX-like systems) to handle .7z | ||||||
|     may be abbreviated to `.tlz'. The same suffixes are used for files in |     files. | ||||||
|     LZMA_Alone format. In practice this should be no problem since tools |  | ||||||
|     included in LZMA Utils support both formats transparently. |  | ||||||
| 
 |  | ||||||
|     Checking the magic bytes is easy way to detect files in the new .lzma |  | ||||||
|     format (the first six bytes: 0xFF 'L' 'Z' 'M' 'A' 0x00). The "file" |  | ||||||
|     command version FIXME contains magic strings for this format. |  | ||||||
| 
 |  | ||||||
|     The old LZMA_Alone format has no magic bytes. Its header cannot contain |  | ||||||
|     arbitrary bytes, thus it is possible to make a guess. Unfortunately the |  | ||||||
|     guessing is usually too hard to be reliable, so don't try it unless you |  | ||||||
|     are desperate. |  | ||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| Q:  Does the lzma command line tool support sparse files? | Q:  I have many .tar.7z files. Can I convert them to .tar.xz without | ||||||
|  |     spending hours recompressing the data? | ||||||
| 
 | 
 | ||||||
| A:  Sparse files can (of course) be compressed like normal files, but | A:  In the "extra" directory, there is a script named 7z2lzma.bash which | ||||||
|     uncompression will not restore sparseness of the file. Use an archiver |     is able to convert some .7z files to the .lzma format (not .xz). It | ||||||
|     tool to take care of sparseness before compressing the data with lzma. |     needs the 7za (or 7z) command from p7zip. The script may silently | ||||||
| 
 |     produce corrupt output if certain assumptions are not met, so | ||||||
|     The reason for this is that archiver tools handle files, while |     decompress the resulting .lzma file and compare it against the | ||||||
|     compression tools handle streams or buffers. Being a sparse file is |     original before deleting the original file! | ||||||
|     a property of the file on the disk, not a property of the stream or |  | ||||||
|     buffer. |  | ||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| Q:  Can I recover parts of a broken LZMA file (e.g. corrupted CD-R)? | Q:  I have many .lzma files. Can I quickly convert them to the .xz format? | ||||||
| 
 | 
 | ||||||
| A:  With LZMA_Alone and single-block .lzma files, you can uncompress the | A:  For now, no. Since XZ Utils supports the .lzma format, it's usually | ||||||
|     file until you hit the first broken byte. The data after the broken |     not too bad to keep the old files in the old format. If you want to | ||||||
|     position is lost. LZMA relies on the uncompression history, and if |     do the conversion anyway, you need to decompress the .lzma files and | ||||||
|     bytes are missing in the middle of the file, it is impossible to |     then recompress to the .xz format. | ||||||
|     reliably continue after the broken section. |  | ||||||
| 
 | 
 | ||||||
|     With multi-block .lzma files it may be possible to locate the next |     Technically, there is a way to make the conversion relatively fast | ||||||
|     block in the file and continue decoding there. A limited recovery |     (roughly twice the time that normal decompression takes). Writing | ||||||
|     tool for this kind of situations is planned. |     such a tool would take quite a bit of time though, and would probably | ||||||
|  |     be useful to only a few people. If you really want such a conversion | ||||||
|  |     tool, contact Lasse Collin and offer some money. | ||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| Q:  Is LZMA patented? | Q:  Can I recover parts of a broken .xz file (e.g. corrupted CD-R)? | ||||||
| 
 | 
 | ||||||
| A:  No, the authors are not aware of any patents that could affect LZMA. | A:  It may be possible if the file consists of multiple blocks, which | ||||||
|     However, due to nature of software patents, the authors cannot |     typically is not the case if the file was created in single-threaded | ||||||
|     guarantee, that LZMA isn't affected by any third party patent. |     mode. There is no recovery program yet. | ||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| Q:  Where can I find documentation about how LZMA works as an algorithm? | Q:  Is (some part of) XZ Utils patented? | ||||||
| 
 | 
 | ||||||
| A:  Read the source code, Luke. There is no documentation about LZMA | A:  Lasse Collin is not aware of any patents that could affect XZ Utils. | ||||||
|     internals. It is possible that Igor Pavlov is the only person on |     However, due to nature of software patents, it's not possible to | ||||||
|     the Earth that completely knows and understands the algorithm. |     guarantee that XZ Utils isn't affected by any third party patent(s). | ||||||
| 
 |  | ||||||
|     You could begin by downloading LZMA SDK, and start reading from |  | ||||||
|     the LZMA decoder to get some idea about the bitstream format. |  | ||||||
|     Before you begin, you should know the basics of LZ77 and |  | ||||||
|     range coding algorithms. LZMA is based on LZ77, but LZMA is |  | ||||||
|     *a lot* more complex. Range coding is used to compress the |  | ||||||
|     final bitstream like Huffman coding is used in Deflate. |  | ||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| Q:  What are filters? | Q:  Where can I find documentation about the file format and algorithms? | ||||||
| 
 | 
 | ||||||
| A:  In context of .lzma files, a filter means an implementation of a | A:  The .xz format is documented in xz-file-format.txt. It is a container | ||||||
|     compression algorithm. The primary filter is LZMA, which is why |     format only, and doesn't include descriptions of any non-trivial | ||||||
|     the names of the tools contain the letters LZMA. |     filters. | ||||||
| 
 | 
 | ||||||
|     liblzma and the new .lzma format support also other filters than LZMA. |     Documenting LZMA and LZMA2 is planned, but for now, there is no other | ||||||
|     There are different types of filters, which are suitable for different |     documentation than the source code. Before you begin, you should know | ||||||
|     types of data. Thus, to select the optimal filter and settings, the |     the basics of LZ77 and range coding algorithms. LZMA is based on LZ77, | ||||||
|     type of the input data being compressed needs to be known. |     but LZMA is *a lot* more complex. Range coding is used to compress | ||||||
| 
 |     the final bitstream like Huffman coding is used in Deflate. | ||||||
|     Some filters are most useful when combined with another filter like |  | ||||||
|     LZMA. These filters increase redundancy in the data, without changing |  | ||||||
|     the size of the data, by taking advantage of properties specific to |  | ||||||
|     the data being compressed. |  | ||||||
| 
 |  | ||||||
|     So far, all the filters are always reversible. That is, no matter what |  | ||||||
|     data you pass to a filter encoder, it can be always defiltered back to |  | ||||||
|     the original form. Because of this, it is safe to compress for example |  | ||||||
|     a software package that contains other file types than executables |  | ||||||
|     using a filter specific to the architecture of the package being |  | ||||||
|     compressed. |  | ||||||
| 
 |  | ||||||
|     The old LZMA_Alone format supports only the LZMA filter. |  | ||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| Q:  I cannot find BCJ and BCJ2 filters. Don't they exist in liblzma? | Q:  I cannot find BCJ and BCJ2 filters. Don't they exist in liblzma? | ||||||
| @ -189,27 +100,23 @@ A:  BCJ filter is called "x86" in liblzma. BCJ2 is not included, | |||||||
|     because it requires using more than one encoded output stream. |     because it requires using more than one encoded output stream. | ||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| Q:  Can I use LZMA in proprietary, non-free applications? | Q:  How do I build a program that needs liblzmadec (lzmadec.h)? | ||||||
| 
 | 
 | ||||||
| A:  Yes. See the file COPYING for details. | A:  liblzmadec is part of LZMA Utils. XZ Utils has liblzma, but no | ||||||
|  |     liblzmadec. The code using liblzmadec should be ported to use | ||||||
|  |     liblzma instead. If you cannot or don't want to do that, download | ||||||
|  |     LZMA Utils from <http://tukaani.org/lzma/>. | ||||||
| 
 | 
 | ||||||
| 
 | 
 | ||||||
| Q:  I would like to help. What can I do? | Q:  The default build of liblzma is too big. How can I make it smaller? | ||||||
| 
 | 
 | ||||||
| A:  See the TODO file. Please contact Lasse Collin before starting to do | A:  Give --enable-small to the configure script. Use also appropriate | ||||||
|     anything, because it is possible that someone else is already working |     --enable or --disable options to include only those filter encoders | ||||||
|     on the same thing. |     and decoders and integrity checks that you actually need. Use | ||||||
|  |     CFLAGS=-Os (with GCC) or equivalent to tell your compiler to optimize | ||||||
|  |     for size. See INSTALL for information about configure options. | ||||||
| 
 | 
 | ||||||
| 
 |     If the result is still too big, take a look at XZ Embedded. It is | ||||||
| Q:  How can I contact the authors? |     a separate project, which provides a limited but significantly | ||||||
| 
 |     smaller XZ decoder implementation than XZ Utils. | ||||||
| A:  Lasse Collin is the maintainer of LZMA Utils. You can contact him |  | ||||||
|     either via IRC (Larhzu on #tukaani at Freenode or IRCnet). Email |  | ||||||
|     should work too, <lasse.collin@tukaani.org>. |  | ||||||
| 
 |  | ||||||
|     Igor Pavlov is the father of LZMA. He is the author of 7-Zip |  | ||||||
|     and LZMA SDK. <http://7-zip.org/> |  | ||||||
| 
 |  | ||||||
|     NOTE: Please don't bother Igor Pavlov with questions specific |  | ||||||
|     to LZMA Utils. |  | ||||||
| 
 | 
 | ||||||
|  | |||||||
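
The new FAQ answer on converting .lzma files to .xz says the only option for now is to decompress and then recompress. The following is a minimal sketch of that slow path, not part of XZ Utils. It assumes Python's standard lzma module is available; the helper name lzma_to_xz and the default preset are illustrative choices, not anything defined by the FAQ.

```python
import lzma
import shutil
from pathlib import Path


def lzma_to_xz(src: Path, preset: int = 6) -> Path:
    """Decompress a legacy .lzma file and recompress it as .xz.

    Hypothetical helper: streams the data so the whole file never
    has to fit in memory. The original file is left untouched.
    """
    # "foo.tar.lzma" becomes "foo.tar.xz"
    dst = src.with_suffix(".xz")
    # FORMAT_ALONE is the legacy .lzma container, FORMAT_XZ is .xz
    with lzma.open(src, "rb", format=lzma.FORMAT_ALONE) as fin, \
         lzma.open(dst, "wb", format=lzma.FORMAT_XZ, preset=preset) as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

As with the 7z2lzma.bash advice in the FAQ, it is prudent to decompress the result and compare it against the original data before deleting the old file.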