root/xz

mirror of https://git.tukaani.org/xz.git synced 2026-04-08 00:58:00 +00:00

Author	SHA1	Message	Date
Lasse Collin	f8c328eed1	Windows: Workaround a UTF-8 issue in Gettext's libintl_setlocale() See the comment. In this package, locale is set at program startup and not changed later, so the point (2) in the comment isn't a problem. Fixes: 46ee0061629fb075d61d83839e14dd193337af59	2024-12-20 16:33:34 +02:00
Lasse Collin	0353390609	Revert "Windows: Use UTF-8 locale when active code page is UTF-8" This reverts commit 0d0b574cc45045d6150d397776340c068df59e2a.	2024-12-20 16:33:34 +02:00
Lasse Collin	4b319e05af	xzdec: Use setlocale() instead of tuklib_gettext_setlocale() xzdec isn't translated and doesn't need libintl on Windows even when NLS is enabled, thus libintl_setlocale() cannot interfere with the locale settings. Thus, standard setlocale() works perfectly. In the commit 78868b6e, the explanation in the commit message is wrong. Fixes: 78868b6ed63fa4c89f73e3dfed27abfb8b0d46db	2024-12-20 16:33:34 +02:00
Lasse Collin	34b80e282e	Windows: Revert the setlocale(LC_ALL, ".UTF8") documentation Only leave the FindFileFirstA() notes from 20dfca81, reverting the incorrect setlocale() notes. On Windows, Gettext's <libintl.h> overrides setlocale() with libintl_setlocale() wrapper. I hadn't noticed this, and thus my conclusions were wrong. Fixes: 20dfca8171dad4c64785ac61d5b68972c444877b	2024-12-20 16:33:28 +02:00
Lasse Collin	5794cda064	tuklib_mbstr_wrap: Silence a warning from Clang Fixes: ca529c3f41a4a19a59e2e252e6dd9255f130c634	2024-12-18 17:50:58 +02:00
Lasse Collin	16c9796ef9	Update THANKS	2024-12-18 17:09:32 +02:00
Lasse Collin	3b5c8a1fca	Update TODO Fixes: 5f6dddc6c911df02ba660564e78e6de80947c947	2024-12-18 17:09:32 +02:00
Lasse Collin	22a35e64ce	lzmainfo: Use tuklib_mbstr_nonprint	2024-12-18 17:09:32 +02:00
Lasse Collin	03111595ee	xzdec: Use tuklib_mbstr_nonprint	2024-12-18 17:09:32 +02:00
Lasse Collin	d22f96921f	xz: Use tuklib_mbstr_nonprint Call tuklib_mask_nonprint() on filenames and also on a few other strings from the command line too. The filename printed by "xz --robot --list" (in list.c) is also masked. It's good to get rid of tabs and newlines which would desync the output but masking other chars wouldn't be strictly necessary. It might matter with sensible filenames if LC_CTYPE is "C" (when iswprint() might reject non-ASCII chars) and a script wants to read a filename from xz's output. Hopefully it's an unusual enough corner case to not be a real problem.	2024-12-18 17:09:32 +02:00
Lasse Collin	40e5733055	Add tuklib_mbstr_nonprint to mask non-printable characters Malicious filenames or other untrusted strings may affect the state of the terminal when such strings are printed as part of (error) messages. Add functions that mask such characters. It's not enough to handle only single-byte control characters. In multibyte locales, some control characters are multibyte too, for example, terminals interpret C1 control characters (U+0080 to U+009F) that are two bytes as UTF-8. Instead of checking for control characters with iswcntrl(), this uses iswprint() to detect printable characters. This is much stricter. On Windows it's actually too strict as it rejects some characters that definitely are printable. Gnulib's quotearg would do a lot more but I hope this simpler method is good enough here. Thanks to Ryan Colyer for the discussion about the problems of the earlier single-byte-only method. Thanks to Christian Weisgerber for reporting a bug in an earlier version of this code. Thanks to Jeroen Roovers for a typo fix. Closes: https://github.com/tukaani-project/xz/pull/118	2024-12-18 17:09:32 +02:00
Lasse Collin	36190c8c4b	Translations: Add preliminary Georgian translation Most of the auto-wrapped strings are translated already. A few strings have changed since this was created though. This file isn't in the Translation Project yet because these strings are still very new. Closes: https://github.com/tukaani-project/xz/pull/145	2024-12-18 17:09:31 +02:00
Lasse Collin	4a0c4f92b8	xz: Make one string simpler for translators Leading spaces in the string can get miscounted by translators.	2024-12-18 17:09:31 +02:00
Lasse Collin	3fcf547e92	lzmainfo: Sync the translatable strings with xz	2024-12-18 17:09:31 +02:00
Lasse Collin	3e9177fd20	xz: Use automatic word wrapping for help texts --long-help is now one line longer because --lzma1 is now on its own line.	2024-12-18 17:09:31 +02:00
Lasse Collin	a0eecc9eb2	po/Makevars: Add --keyword=W_:... to XGETTEXT_OPTIONS The text was copied from tuklib_gettext.h. Also rearrange the --keyword options to be last on the line.	2024-12-18 17:09:31 +02:00
Lasse Collin	ca529c3f41	Add tuklib_mbstr_wrap for automatic word wrapping Automatic word wrapping makes translators' work easier and reduces errors like misaligned columns or overlong lines. Right-to-left languages and languages that don't use spaces between words will still need extra effort. (xz hasn't been translated to any RTL language so far.)	2024-12-18 17:09:31 +02:00
Lasse Collin	314b83ceba	Build: Sort filenames to ASCII order in Makefile.am	2024-12-18 17:09:31 +02:00
Lasse Collin	df399c5255	tuklib_mbstr_width: Add tuklib_mbstr_width_mem() It's a new function split from tuklib_mbstr_width(). It's useful with partial strings that aren't terminated with \0.	2024-12-18 17:09:30 +02:00
Lasse Collin	51081efae4	tuklib_mbstr_width: Update a comment about shift states	2024-12-18 17:09:30 +02:00
Lasse Collin	7ff1b0ac53	tuklib_mbstr_width: Don't mention shift states in the API docs It is assumed that this code won't be used with charsets that use locking shift states.	2024-12-18 17:09:30 +02:00
Lasse Collin	3c16105936	tuklib_mbstr_width: Use stricter return value checking This should make no difference in practice (at least if mbrtowc() isn't broken).	2024-12-18 17:09:30 +02:00
Lasse Collin	b797c44c42	tuklib_mbstr_width: Change the behavior when wcwidth() is not available If wcwidth() isn't available (Windows), previously it was assumed that one byte == one column in the terminal. Now it is assumed that one multibyte character == one column. This works better with UTF-8. Languages that only use single-width characters without any combining characters should work correctly with this. In xz, none of po/*.po contain combining characters and only ko.po, zh_CN.po, and zh_TW.po contain fullwidth characters. Thus, "only" those three translations in xz are broken on Windows with the UTF-8 code page. Broken means that column headings in xz -lvv and (only in the master branch) strings in --long-help are misaligned, so it's not a huge problem. I don't know if those three languages displayed perfectly before the UTF-8 change because I hadn't tested translations with native Windows builds before. Fixes: 46ee0061629fb075d61d83839e14dd193337af59	2024-12-18 17:09:30 +02:00
Lasse Collin	78868b6ed6	xzdec: Use setlocale() via tuklib_gettext_setlocale() xzdec isn't translated and didn't have locale-specific behavior in the past. On Windows with UTF-8 in the application manifest, setting the locale makes a difference though: - Without any setlocale() call, non-ASCII filenames don't display properly in Command Prompt unless one first uses "chcp 65001" to set the console code page to UTF-8. - setlocale(LC_ALL, "") is enough to make non-ASCII filenames print correctly in Command Prompt without using "chcp 65001", assuming that the non-UTF-8 code page (like 850) supports those non-ASCII characters. - setlocale(LC_ALL, ".UTF8") is even better because then mbrtowc() and such functions use an UTF-8 locale instead of a legacy code page. The tuklib_gettext_setlocale() macro takes care of this (without enabling any translations). Fixes: 46ee0061629fb075d61d83839e14dd193337af59	2024-12-18 17:09:30 +02:00
Lasse Collin	0d0b574cc4	Windows: Use UTF-8 locale when active code page is UTF-8 XZ Utils 5.6.3 set the active code page to UTF-8 to fix CVE-2024-47611. This wasn't paired with UCRT-specific setlocale(LC_ALL, ".UTF8"), thus non-ASCII characters from translations became mojibake. Fixes: 46ee0061629fb075d61d83839e14dd193337af59	2024-12-18 17:09:30 +02:00
Lasse Collin	20dfca8171	Windows: Document the need for setlocale(LC_ALL, ".UTF8") Also warn about unpaired surrogates and (somewhat UTF-8-specific) MAX_PATH issue in FindFirstFileA(). Fixes: 46ee0061629fb075d61d83839e14dd193337af59	2024-12-18 17:09:29 +02:00
Lasse Collin	4e936f2340	xzdec: Call tuklib_progname_init() early enough If the early pledge() call on OpenBSD fails, it calls my_errorf() which requires the "progname" variable. Fixes: d74fb5f060b76db709b50f5fd37490394e52f975	2024-12-18 17:09:29 +02:00
Lasse Collin	61feaf681b	CMake: Bump maximum policy version to 3.31 With CMake 3.31, there were a few warnings from CMP0177 "install() DESTINATION paths are normalized". These occurred because the install(FILES) command in my_install_man_lang() is called with a DESTINATION path that contains two consecutive slashes, for example, "share/man//man1". Such a path is for the English man pages. With translated man pages, the language code goes between the slashes. The warning was probably triggered because the extra slash gets removed by the normalization.	2024-12-18 17:09:29 +02:00
Lasse Collin	b0bb84dd7b	Update THANKS	2024-12-18 17:09:29 +02:00
Dexter Castor Döpping	bee0c044d3	liblzma: Fix incorrect macro name in a comment Fixes: 33b8a24b6646a9dbfd8358405aec466b13078559 Closes: https://github.com/tukaani-project/xz/pull/155	2024-12-18 17:09:29 +02:00
Lasse Collin	2cfa1ad0a9	license-check.sh: Add an exception for doc/SHA256SUMS Fixes: 36b531022f24a2ab57a2dfb9e5052f1c176e9d9a	2024-12-18 17:09:21 +02:00
Lasse Collin	36b531022f	doc/SHA256SUMS: Add the list of SHA-256 hashes of release files The release files are signed but verifying the signatures cannot catch certain types of attacks: 1. A malicious maintainer could make more than one variant of a package. One could be for general distribution. Another with malicious content could be targeted to specific users, for example, distributing the malicious version on a mirror controlled by the attacker. 2. If the signing key of an honest maintainer was compromised without being detected, a similar situation as described above could occur. SHA256SUMS could be put on the project website but having it in the Git repository makes it obvious that old lines aren't modified when the file is updated. Hashes of uncompressed files are included too. This way tarballs can be recompressed and the hashes can still be verified.	2024-12-01 21:38:17 +02:00
Lasse Collin	fe9e66993f	Docs: Remove .github/SECURITY.md One of the reasons to have this file in the xz repository was to show vulnerability reporting info in the Security section on GitHub. On 2024-11-25, I added SECURITY.md to the tukaani-project organization on GitHub: https://github.com/tukaani-project/.github/blob/main/SECURITY.md GitHub shows that file in all projects in the organization unless overridden by a project-specific SECURITY.md. Thus, removing the file from the xz repo makes GitHub show the organization-wide text instead. Maintaining a single copy for the whole GitHub organization makes things simpler. It's also nicer to have fewer GitHub-specific files in the xz repo. Information how to report bugs (including security issues) is available in README and on the home page too. The OpenSSF Scorecard tool didn't find .github/SECURITY.md from the xz repository. There was a suggestion to move the file to the top-level directory where Scorecard should find it. However, Scorecard does find the organization-wide SECURITY.md. Thus, the file isn't needed in the xz repository to score points in the Scorecard game: https://scorecard.dev/viewer/?uri=github.com/tukaani-project/xz Closes: https://github.com/tukaani-project/xz/issues/148 Closes: https://github.com/tukaani-project/xz/pull/149	2024-11-30 12:05:59 +02:00
Lasse Collin	b361772736	Translations: Update the Chinese (traditional) translation	2024-11-30 10:27:14 +02:00
Lasse Collin	c15115f7ed	liblzma: Optimize the loop conditions in BCJ filters Compilers cannot optimize the addition "i + 4" away since theoretically it could overflow.	2024-11-26 19:17:42 +02:00
Lasse Collin	9f69e71e78	Update THANKS	2024-11-25 16:26:54 +02:00
Mark Wielaard	48ff3f0652	xz: Landlock: Fix a file descriptor leak	2024-11-25 12:28:44 +02:00
Sam James	dbca3d078e	CI: update FreeBSD, NetBSD, OpenBSD, Solaris actions Checked the changes and they're all innocuous. This should hopefully fix the "externally managed" pip error in these jobs that started recently.	2024-10-02 10:10:54 +03:00
Lasse Collin	a94b85bea3	Add NEWS for 5.6.3	2024-10-01 20:06:54 +03:00
Lasse Collin	be4bf94446	cmake/tuklib_large_file_support.cmake: Add a missing include v5.2 didn't build with CMake. Other branches had include(CMakePushCheckState) in top-level CMakeLists.txt which made the build work. Fixes: 597f49b61475438a43a417236989b2acc968a686	2024-10-01 14:49:41 +03:00
Lasse Collin	1ebbe915d4	Update THANKS	2024-10-01 12:10:23 +03:00
Lasse Collin	74702ee00e	Tests/Windows: Add the application manifest to the test programs This ensures that the test programs get executed the same way as the binaries that are installed.	2024-10-01 12:10:23 +03:00
Lasse Collin	7ddf2273e0	license-check.sh: Add an exception for w32_application.manifest The file gets embedded as is into executables, thus it cannot hold a license identifier.	2024-10-01 12:10:23 +03:00
Lasse Collin	46ee006162	Windows: Embed an application manifest in the EXE files IMPORTANT: This includes a security fix to command line tool argument handling. Some toolchains embed an application manifest by default to declare UAC-compliance. Some also declare compatibility with Vista/8/8.1/10/11 to let the app access features newer than those of Vista. We want all the above but also two more things: - Declare that the app is long path aware to support paths longer than 259 characters (this may also require a registry change). - Force the code page to UTF-8. This allows the command line tools to access files whose names contain characters that don't exist in the current legacy code page (except unpaired surrogates). The UTF-8 code page also fixes security issues in command line argument handling which can be exploited with malicious filenames. See the new file w32_application.manifest.comments.txt. Thanks to Orange Tsai and splitline from DEVCORE Research Team for discovering this issue. Thanks to Vijay Sarvepalli for reporting the issue to me. Thanks to Kelvin Lee for testing with MSVC and helping with the required build system fixes.	2024-10-01 12:10:23 +03:00
Lasse Collin	dad1530915	Windows: Set DLL name accurately in StringFileInfo on Cygwin and MSYS2 Now the information in the "Details" tab in the file properties dialog matches the naming convention of Cygwin and MSYS2. This is only a cosmetic change.	2024-09-30 16:55:23 +03:00
Lasse Collin	8940ecb96f	common_w32res.rc: White space edits LANGUAGE and VS_VERSION_INFO begin new statements so put an empty line between them.	2024-09-29 01:27:16 +03:00
Lasse Collin	c3b9dad07d	CMake: Add the resource files to the Cygwin and MSYS2 builds Autotools-based build has always done this so this is for consistency. However, the CMake build won't create the DEF file when building for Cygwin or MSYS2 because in that context it should be useless. (If Cygwin or MSYS2 is used to host building of normal Windows binaries then the DEF file is still created.)	2024-09-29 01:26:45 +03:00
Lasse Collin	da4f275bd1	CMake: Fix Windows resource file dependencies If common_w32res.rc is modified, the resource files need to be rebuilt. In contrast, the liblzma*.map files truly are link dependencies.	2024-09-29 01:26:13 +03:00
Lasse Collin	1c673c0aac	CMake: Checking for CYGWIN covers MSYS2 too On MSYS2, both CYGWIN and MSYS are set.	2024-09-29 01:26:13 +03:00
Lasse Collin	6aaa0173b8	Translations: Add the SPDX license identifier to pt_BR.po	2024-09-28 09:38:13 +03:00
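
Commit 40e5733055 describes masking non-printable characters by decoding multibyte characters and testing each with iswprint() rather than iswcntrl(). The following is only a minimal sketch of that idea, not the actual tuklib_mbstr_nonprint code: the function name mask_nonprint, the '?' replacement character, and the caller-supplied buffer are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

// Hypothetical helper (not the tuklib API): copy src to dst and replace
// every non-printable or invalid multibyte character with a single '?'.
// Returns 0 on success or -1 if dst is too small.
int
mask_nonprint(const char *src, char *dst, size_t dst_size)
{
	mbstate_t state;
	memset(&state, 0, sizeof(state));

	size_t in_left = strlen(src);
	size_t out = 0;

	while (in_left > 0) {
		wchar_t wc;
		size_t n = mbrtowc(&wc, src, in_left, &state);
		int printable = 1;

		if (n == (size_t)-1 || n == (size_t)-2 || n == 0) {
			// Invalid or truncated sequence: mask one byte
			// and restart decoding from the next byte.
			memset(&state, 0, sizeof(state));
			n = 1;
			printable = 0;
		} else if (!iswprint((wint_t)wc)) {
			// Covers single-byte controls (tab, ESC) and, in
			// multibyte locales, multibyte controls too.
			printable = 0;
		}

		// A masked character always takes one output byte.
		size_t need = printable ? n : 1;
		if (out + need >= dst_size)
			return -1;

		if (printable)
			memcpy(dst + out, src, n);
		else
			dst[out] = '?';

		out += need;
		src += n;
		in_left -= n;
	}

	if (dst_size == 0)
		return -1;

	dst[out] = '\0';
	return 0;
}
```

A caller would typically run setlocale(LC_ALL, "") at startup so that mbrtowc() and iswprint() follow the user's locale. Without it, the "C" locale applies and only ASCII input is handled predictably, which matches the LC_CTYPE caveat in commit d22f96921f.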

2710 Commits
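
Commit c15115f7ed notes that compilers cannot optimize away the addition "i + 4" in a loop condition because it could theoretically overflow. The sketch below illustrates one way such a condition can be rewritten. It is an assumption about the general shape of the change, not the actual liblzma code: the function names are hypothetical and the loop body is a placeholder rather than a real BCJ filter.

```c
#include <stddef.h>
#include <stdint.h>

// Before: "i + 4" is evaluated on every iteration. Since size_t
// arithmetic wraps around, the compiler must keep the addition.
size_t
process_words_before(uint8_t *buf, size_t size)
{
	size_t i;
	for (i = 0; i + 4 <= size; i += 4)
		buf[i] ^= 0xFF; // Placeholder for the per-word filter work

	return i;
}

// After: round size down to a multiple of four once, then compare
// i against it directly. No addition is needed in the condition.
size_t
process_words_after(uint8_t *buf, size_t size)
{
	size &= ~(size_t)3;

	size_t i;
	for (i = 0; i < size; i += 4)
		buf[i] ^= 0xFF; // Same placeholder work as above

	return i;
}
```

Both functions visit the same 4-byte units and return the number of bytes processed, so the rewrite changes only how the loop bound is expressed.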