Commit Graph

2744 Commits

Author SHA1 Message Date
Lasse Collin ca529c3f41
Add tuklib_mbstr_wrap for automatic word wrapping
Automatic word wrapping makes translators' work easier and reduces
errors like misaligned columns or overlong lines. Right-to-left
languages and languages that don't use spaces between words will
still need extra effort. (xz hasn't been translated to any RTL
language so far.)
2024-12-18 17:09:31 +02:00
Lasse Collin 314b83ceba
Build: Sort filenames to ASCII order in Makefile.am 2024-12-18 17:09:31 +02:00
Lasse Collin df399c5255
tuklib_mbstr_width: Add tuklib_mbstr_width_mem()
It's a new function split from tuklib_mbstr_width().
It's useful with partial strings that aren't terminated with \0.
2024-12-18 17:09:30 +02:00
Lasse Collin 51081efae4
tuklib_mbstr_width: Update a comment about shift states 2024-12-18 17:09:30 +02:00
Lasse Collin 7ff1b0ac53
tuklib_mbstr_width: Don't mention shift states in the API docs
It is assumed that this code won't be used with charsets that use
locking shift states.
2024-12-18 17:09:30 +02:00
Lasse Collin 3c16105936
tuklib_mbstr_width: Use stricter return value checking
This should make no difference in practice (at least if mbrtowc()
isn't broken).
2024-12-18 17:09:30 +02:00
Lasse Collin b797c44c42
tuklib_mbstr_width: Change the behavior when wcwidth() is not available
If wcwidth() isn't available (Windows), previously it was assumed
that one byte == one column in the terminal. Now it is assumed that
one multibyte character == one column. This works better with UTF-8.
Languages that only use single-width characters without any combining
characters should work correctly with this.

In xz, none of po/*.po contain combining characters and only ko.po,
zh_CN.po, and zh_TW.po contain fullwidth characters. Thus, "only"
those three translations in xz are broken on Windows with the
UTF-8 code page. Broken means that column headings in xz -lvv and
(only in the master branch) strings in --long-help are misaligned,
so it's not a huge problem. I don't know if those three languages
displayed perfectly before the UTF-8 change because I hadn't tested
translations with native Windows builds before.

Fixes: 46ee006162
2024-12-18 17:09:30 +02:00
Lasse Collin 78868b6ed6
xzdec: Use setlocale() via tuklib_gettext_setlocale()
xzdec isn't translated and didn't have locale-specific behavior
in the past. On Windows with UTF-8 in the application manifest,
setting the locale makes a difference though:

  - Without any setlocale() call, non-ASCII filenames don't display
    properly in Command Prompt unless one first uses "chcp 65001"
    to set the console code page to UTF-8.

  - setlocale(LC_ALL, "") is enough to make non-ASCII filenames
    print correctly in Command Prompt without using "chcp 65001",
    assuming that the non-UTF-8 code page (like 850) supports
    those non-ASCII characters.

  - setlocale(LC_ALL, ".UTF8") is even better because then mbrtowc() and
    such functions use an UTF-8 locale instead of a legacy code page.
    The tuklib_gettext_setlocale() macro takes care of this (without
    enabling any translations).

Fixes: 46ee006162
2024-12-18 17:09:30 +02:00
Lasse Collin 0d0b574cc4
Windows: Use UTF-8 locale when active code page is UTF-8
XZ Utils 5.6.3 set the active code page to UTF-8 to fix CVE-2024-47611.
This wasn't paired with UCRT-specific setlocale(LC_ALL, ".UTF8"), thus
non-ASCII characters from translations became mojibake.

Fixes: 46ee006162
2024-12-18 17:09:30 +02:00
Lasse Collin 20dfca8171
Windows: Document the need for setlocale(LC_ALL, ".UTF8")
Also warn about unpaired surrogates and (somewhat UTF-8-specific)
MAX_PATH issue in FindFirstFileA().

Fixes: 46ee006162
2024-12-18 17:09:29 +02:00
Lasse Collin 4e936f2340
xzdec: Call tuklib_progname_init() early enough
If the early pledge() call on OpenBSD fails, it calls my_errorf()
which requires the "progname" variable.

Fixes: d74fb5f060
2024-12-18 17:09:29 +02:00
Lasse Collin 61feaf681b
CMake: Bump maximum policy version to 3.31
With CMake 3.31, there were a few warnings from
CMP0177 "install() DESTINATION paths are normalized".
These occurred because the install(FILES) command in
my_install_man_lang() is called with a DESTINATION path
that contains two consecutive slashes, for example,
"share/man//man1". Such a path is for the English man pages.
With translated man pages, the language code goes between
the slashes. The warning was probably triggered because the
extra slash gets removed by the normalization.
2024-12-18 17:09:29 +02:00
Lasse Collin b0bb84dd7b
Update THANKS 2024-12-18 17:09:29 +02:00
Dexter Castor Döpping bee0c044d3
liblzma: Fix incorrect macro name in a comment
Fixes: 33b8a24b66
Closes: https://github.com/tukaani-project/xz/pull/155
2024-12-18 17:09:29 +02:00
Lasse Collin 2cfa1ad0a9
license-check.sh: Add an exception for doc/SHA256SUMS
Fixes: 36b531022f
2024-12-18 17:09:21 +02:00
Lasse Collin 36b531022f
doc/SHA256SUMS: Add the list of SHA-256 hashes of release files
The release files are signed but verifying the signatures cannot
catch certain types of attacks:

1. A malicious maintainer could make more than one variant of
   a package. One could be for general distribution. Another
   with malicious content could be targeted to specific users,
   for example, distributing the malicious version on a mirror
   controlled by the attacker.

2. If the signing key of an honest maintainer was compromised
   without being detected, a similar situation as described
   above could occur.

SHA256SUMS could be put on the project website but having it in
the Git repository makes it obvious that old lines aren't modified
when the file is updated.

Hashes of uncompressed files are included too. This way tarballs
can be recompressed and the hashes can still be verified.
2024-12-01 21:38:17 +02:00
Lasse Collin fe9e66993f Docs: Remove .github/SECURITY.md
One of the reasons to have this file in the xz repository was to
show vulnerability reporting info in the Security section on GitHub.
On 2024-11-25, I added SECURITY.md to the tukaani-project organization
on GitHub:

    https://github.com/tukaani-project/.github/blob/main/SECURITY.md

GitHub shows that file in all projects in the organization unless
overridden by a project-specific SECURITY.md. Thus, removing
the file from the xz repo makes GitHub show the organization-wide
text instead.

Maintaining a single copy for the whole GitHub organization makes
things simpler. It's also nicer to have fewer GitHub-specific files
in the xz repo. Information how to report bugs (including security
issues) is available in README and on the home page too.

The OpenSSF Scorecard tool didn't find .github/SECURITY.md from the
xz repository. There was a suggestion to move the file to the top-level
directory where Scorecard should find it. However, Scorecard does find
the organization-wide SECURITY.md. Thus, the file isn't needed in the
xz repository to score points in the Scorecard game:

    https://scorecard.dev/viewer/?uri=github.com/tukaani-project/xz

Closes: https://github.com/tukaani-project/xz/issues/148
Closes: https://github.com/tukaani-project/xz/pull/149
2024-11-30 12:05:59 +02:00
Lasse Collin b361772736 Translations: Update the Chinese (traditional) translation 2024-11-30 10:27:14 +02:00
Lasse Collin c15115f7ed liblzma: Optimize the loop conditions in BCJ filters
Compilers cannot optimize the addition "i + 4" away since theoretically
it could overflow.
2024-11-26 19:17:42 +02:00
Lasse Collin 9f69e71e78 Update THANKS 2024-11-25 16:26:54 +02:00
Mark Wielaard 48ff3f0652 xz: Landlock: Fix a file descriptor leak 2024-11-25 12:28:44 +02:00
Sam James dbca3d078e CI: update FreeBSD, NetBSD, OpenBSD, Solaris actions
Checked the changes and they're all innocuous. This should hopefully
fix the "externally managed" pip error in these jobs that started
recently.
2024-10-02 10:10:54 +03:00
Lasse Collin a94b85bea3 Add NEWS for 5.6.3 2024-10-01 20:06:54 +03:00
Lasse Collin be4bf94446 cmake/tuklib_large_file_support.cmake: Add a missing include
v5.2 didn't build with CMake. Other branches had
include(CMakePushCheckState) in top-level CMakeLists.txt
which made the build work.

Fixes: 597f49b614
2024-10-01 14:49:41 +03:00
Lasse Collin 1ebbe915d4 Update THANKS 2024-10-01 12:10:23 +03:00
Lasse Collin 74702ee00e Tests/Windows: Add the application manifest to the test programs
This ensures that the test programs get executed the same way as
the binaries that are installed.
2024-10-01 12:10:23 +03:00
Lasse Collin 7ddf2273e0 license-check.sh: Add an exception for w32_application.manifest
The file gets embedded as is into executables, thus it cannot
hold a license identifier.
2024-10-01 12:10:23 +03:00
Lasse Collin 46ee006162 Windows: Embed an application manifest in the EXE files
IMPORTANT: This includes a security fix to command line tool
           argument handling.

Some toolchains embed an application manifest by default to declare
UAC-compliance. Some also declare compatibility with Vista/8/8.1/10/11
to let the app access features newer than those of Vista.

We want all the above but also two more things:

  - Declare that the app is long path aware to support paths longer
    than 259 characters (this may also require a registry change).

  - Force the code page to UTF-8. This allows the command line tools
    to access files whose names contain characters that don't exist
    in the current legacy code page (except unpaired surrogates).
    The UTF-8 code page also fixes security issues in command line
    argument handling which can be exploited with malicious filenames.
    See the new file w32_application.manifest.comments.txt.

Thanks to Orange Tsai and splitline from DEVCORE Research Team
for discovering this issue.

Thanks to Vijay Sarvepalli for reporting the issue to me.

Thanks to Kelvin Lee for testing with MSVC and helping with
the required build system fixes.
2024-10-01 12:10:23 +03:00
Lasse Collin dad1530915 Windows: Set DLL name accurately in StringFileInfo on Cygwin and MSYS2
Now the information in the "Details" tab in the file properties
dialog matches the naming convention of Cygwin and MSYS2. This
is only a cosmetic change.
2024-09-30 16:55:23 +03:00
Lasse Collin 8940ecb96f common_w32res.rc: White space edits
LANGUAGE and VS_VERSION_INFO begin new statements so put an empty line
between them.
2024-09-29 01:27:16 +03:00
Lasse Collin c3b9dad07d CMake: Add the resource files to the Cygwin and MSYS2 builds
Autotools-based build has always done this so this is for consistency.

However, the CMake build won't create the DEF file when building
for Cygwin or MSYS2 because in that context it should be useless.
(If Cygwin or MSYS2 is used to host building of normal Windows
binaries then the DEF file is still created.)
2024-09-29 01:26:45 +03:00
Lasse Collin da4f275bd1 CMake: Fix Windows resource file dependencies
If common_w32res.rc is modified, the resource files need to be rebuilt.
In contrast, the liblzma*.map files truly are link dependencies.
2024-09-29 01:26:13 +03:00
Lasse Collin 1c673c0aac CMake: Checking for CYGWIN covers MSYS2 too
On MSYS2, both CYGWIN and MSYS are set.
2024-09-29 01:26:13 +03:00
Lasse Collin 6aaa0173b8 Translations: Add the SPDX license identifier to pt_BR.po 2024-09-28 09:38:13 +03:00
Lasse Collin dc7b9f24b7 Windows/CMake: Use the correct resource file for lzmadec.exe
CMakeLists.txt was using xzdec_w32res.rc for both xzdec and lzmadec.

Fixes: 998d0b2953
2024-09-25 21:31:06 +03:00
Lasse Collin b834ae5f80 Translations: Update the Brazilian Portuguese translation 2024-09-25 21:29:59 +03:00
Lasse Collin eceb023d4c Update THANKS 2024-09-17 01:26:02 +03:00
Tobias Stoeckmann 76cfd0a9bb lzmainfo: Avoid integer overflow
The MB output can overflow with huge numbers. Most likely these are
invalid .lzma files anyway, but let's avoid garbage output.

lzmadec was adapted from LZMA Utils. The original code with this bug
was written in 2005, over 19 years ago.

Co-authored-by: Lasse Collin <lasse.collin@tukaani.org>
Closes: https://github.com/tukaani-project/xz/pull/144
2024-09-17 01:26:02 +03:00
Tobias Stoeckmann 78355aebb7 xzdec: Remove unused short option -M
"xzdec -M123" exited with exit status 1 without printing
any messages. The "M:" entry should have been removed when
the memory usage limiter support was removed from xzdec.

Fixes: 792331bdee
Closes: https://github.com/tukaani-project/xz/pull/143
[ Lasse: Commit message edits ]
2024-09-16 23:33:29 +03:00
Lasse Collin e5758db7bd Update THANKS 2024-09-10 13:54:47 +03:00
Firas Khalil Khana 80ffa38f56 Build: Fix a typo in autogen.sh
Fixes: e9be74f5b1
Closes: https://github.com/tukaani-project/xz/pull/141
2024-09-10 13:43:00 +03:00
Lasse Collin 68c54e45d0 Translations: Update Chinese (simplified) translation
Differences to the zh_CN.po file from the Translation Project:

  - Two uses of \v were fixed.

  - Missing "OPTS" translation in --riscv[=OPTS] was copied from
    previous lines.

  - "make update-po" was run to remove line numbers from comments.
2024-09-02 20:08:40 +03:00
Lasse Collin 2230692aa1 Translations: Update the Catalan translation
Differences to the ca.po file from the Translation Project:

  - An overlong line translating --filters-help was wrapped.

  - "make update-po" was used to remove line numbers from the comments
    to match the changes in fccebe2b4f
    and 093490b582. xz.pot in the TP
    is older than these commits.
2024-09-02 19:40:50 +03:00
Lasse Collin 3e7723ce26 Update THANKS 2024-09-02 17:33:50 +03:00
Lasse Collin d3e0e679b2 CMake: Don't install lzmadec.1 symlinks if XZ_TOOL_LZMADEC=OFF
Thanks-to: 榆柳松 (ZhengSen Wang) <wzhengsen@gmail.com>
Fixes: fb50c6ba1d
Closes: https://github.com/tukaani-project/xz/pull/134
2024-09-02 17:33:42 +03:00
Lasse Collin acdf21033a CMake: Fix the build when XZ_TOOL_LZMADEC=OFF
Co-developed-by: 榆柳松 (ZhengSen Wang) <wzhengsen@gmail.com>
Fixes: fb50c6ba1d
Fixes: https://github.com/tukaani-project/xz/pull/134
2024-09-02 17:33:06 +03:00
Lasse Collin 5e37598750 Update THANKS 2024-08-22 11:01:07 +03:00
Yifeng Li 6cd7c86078 liblzma: Fix x86-64 movzw compatibility in range_decoder.h
Support for instruction "movzw" without suffix in "GNU as" was
added in commit [1] and stabilized in binutils 2.27, released
in August 2016. Earlier systems don't accept this instruction
without a suffix, making range_decoder.h's inline assembly
unable to build on old systems such as Ubuntu 16.04, creating
error messages like:

    lzma_decoder.c: Assembler messages:
    lzma_decoder.c:371: Error: no such instruction: `movzw 2(%r11),%esi'
    lzma_decoder.c:373: Error: no such instruction: `movzw 4(%r11),%edi'
    lzma_decoder.c:388: Error: no such instruction: `movzw 6(%r11),%edx'
    lzma_decoder.c:398: Error: no such instruction: `movzw (%r11,%r14,4),%esi'

Change "movzw" to "movzwl" for compatibility.

[1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=c07315e0c610e0e3317b4c02266f81793df253d2

Suggested-by: Lasse Collin <lasse.collin@tukaani.org>
Tested-by: Yifeng Li <tomli@tomli.me>
Signed-off-by: Yifeng Li <tomli@tomli.me>
Fixes: 3182a330c1
Fixes: https://github.com/tukaani-project/xz/issues/121
Closes: https://github.com/tukaani-project/xz/pull/136
2024-08-22 10:59:08 +03:00
Lasse Collin bf901dee5d Build: Comment that elf_aux_info(3) will be available on OpenBSD >= 7.6 2024-07-19 20:06:24 +03:00
Lasse Collin f7103c2c2a Revert "liblzma: Add ARM64 CRC32 instruction support detection on OpenBSD"
This reverts commit dc03f6290f.

OpenBSD 7.6 will support elf_aux_info(3), and the detection code used
on FreeBSD will work on OpenBSD 7.6 too. Keep things simpler and drop
the OpenBSD-specific sysctl() method.

Thanks to Christian Weisgerber.
2024-07-19 20:06:24 +03:00