liblzma: Avoid implementation-defined behavior in the RISC-V filter.

GCC docs promise that it works and a few other compilers do too. Clang/LLVM is documented source code only but unsurprisingly it behaves the same as others on x86-64 at least. But the certainly-portable way is good enough here so use that.
2026-03-14 21:58:13 +00:00 · 2024-02-17 16:01:32 +02:00 · 2024-02-17 16:01:32 +02:00 · f1d6b88aef
commit f1d6b88aef
parent 843ddc5f61
1 changed files with 22 additions and 8 deletions
--- a/src/liblzma/simple/riscv.c
+++ b/src/liblzma/simple/riscv.c
@ -511,14 +511,28 @@ riscv_encode(void *simple lzma_attribute((__unused__)),
 				// be the same.
 				// Arithmetic right shift makes sign extension
-				// trivial but C doesn't guarantee it for
+				// trivial but (1) it's implementation-defined
-				// signed integers so a fallback is provided
+				// behavior (C99/C11/C23 6.5.7-p5) and so is
-				// for portability.
+				// (2) casting unsigned to signed (6.3.1.3-p3).
 				//
 				// One can check for (1) with
 				//
 				//     if ((-1 >> 1) == -1) ...
 				//
 				// but (2) has to be checked from the
 				// compiler docs. GCC promises that (1)
 				// and (2) behave in the common expected
 				// way and thus
 				//
 				//     addr += (uint32_t)(
 				//             (int32_t)inst2 >> 20);
 				//
 				// does the same as the code below. But since
 				// the 100 % portable way is only a few bytes
 				// bigger code and there is no real speed
 				// difference, let's just use that, especially
 				// since the decoder doesn't need this at all.
 				uint32_t addr = inst & 0xFFFFF000;
 				if ((-1 >> 1) == -1)
 					addr += (uint32_t)(
 						(int32_t)inst2 >> 20);
 				else
 				addr += (inst2 >> 20)
 						- ((inst2 >> 19) & 0x1000);