Enzo patches part2 by smfrench · Pull Request #175 · smfrench/smb3-kernel

smfrench · 2026-04-20T20:16:09Z

8 compression patches from Enzo

@dst

@dst buffer is allocated with same size as @src, which, for good compression cases, works fine. However, when compression goes bad (e.g. random bytes payloads), the compressed size can increase significantly, and even by stopping the main loop at 7/8 of @slen, writing leftover literals could write past the end of @dst because of LZ77 metadata. To fix this, add lz77_compressed_alloc_size() helper to compute the correct allocation size for @dst, accounting for metadata and worst cast scenario (all literals). While this is overprovisioning memory, it's not only correct, but also allows lz77_compress() main loop to run without ever checking @dst limits (i.e. a perf improvement). Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>

@Cur

- lz77_match_len() increments @Cur before checking for equality, leading to off-by-one match len in some cases. Fix by moving pointers increment to inside the loop. Also rename @wnd arg to @match (more accurate name). - both lz77_match_len() and lz77_compress() checked for "buf + step < end" when the correct is "<=" for such cases. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>

Increase max distance (i.e. window size) from 1k to 8k. This allows better compression and is just as fast. Other: - drop LZ77_MATCH_MIN_DIST as it's nused -- main loop already checks if dist > 0 Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>

End-of-stream flag could lead to UB because of int promotion (overwriting signed bit). Fix it by changing operand from '1' to '1UL'. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>

This patch implements several micro-optimizations on lz77_compress() with the goal of reducing the number of instructions per [input] byte (a.k.a. IPB). Changes: - change hashtable to be u32 (instead of u64) -- change the hash function to reflect that (adds lz77_hash() and lz77_read32() helpers) - batch-write literals instead of 1 by 1 -- now that we have a well defined hot path (match finding) and a cold path (encode literals + match), batch writing makes a significant difference - implement adaptive skipping of input bytes -- skip input bytes more aggressively if too few matches are being found - name some constants for more meaningful context Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>

Document parts of the code, especially the apparently non-sense parts. Other: - change pointer increment constants to sizeof() values Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>

Add compress/common.h to aggregate helpers and definitions that will be shared with other compression algorithms. Also add a few build time checks for proper support. Changes: - update affected call paths in compress/lz77.c Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>

Set it to number of currently defined algorithms (6 as of now). Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>

Copilot

Pull request overview

This PR continues the SMB3 message-compression work by refactoring the client-side LZ77 compressor for performance, introducing shared compression helpers, and adjusting buffer sizing to support worst-case output.

Changes:

Add a shared compress/common.h with hashing and unaligned memory access helpers used by compression algorithms.
Refactor compress/lz77.c to use the new common helpers and a new literals/match encoding flow (including adaptive skipping).
Add lz77_compressed_alloc_size() and switch smb_compress() to allocate the worst-case output buffer size up front.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
fs/smb/common/smb2pdu.h	Adds `SMB3_COMPRESS_MAX_ALGS` constant for compression algorithm indexing.
fs/smb/client/compress/lz77.h	Adds `lz77_compressed_alloc_size()` helper and updates function signature constness.
fs/smb/client/compress/lz77.c	Major LZ77 compressor refactor using shared helpers and new encoding/match-finding structure.
fs/smb/client/compress/common.h	New header providing shared hashing, match-length, and unaligned access helpers.
fs/smb/client/compress.c	Uses `lz77_compressed_alloc_size()` to allocate sufficient destination buffer for LZ77.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-20T20:24:19Z

+#ifndef __LITTLE_ENDIAN /* TODO */
+# error "SMB3 compression is only supported on little endian architectures"
+#endif /* !__LITTLE_ENDIAN */


common.h hard-fails the build on big-endian via #error, but CONFIG_CIFS_COMPRESSION has no Kconfig dependency that restricts endianness. This will break CIFS_COMPRESSION builds on big-endian architectures; either add an appropriate Kconfig depends on !CPU_BIG_ENDIAN (or equivalent) or make the compression code endian-safe (e.g., use explicit little-endian unaligned accesses for the on-wire format) and remove the build-time error.

Copilot · 2026-04-20T20:24:20Z

+#if BITS_PER_LONG < 64 /* TODO */
+# error "SMB3 compression is only supported on 64-bit architectures"
+#endif /* BITS_PER_LONG < 64 */
+
+/* Build time double checks (probably unnecessary) */
+static_assert(sizeof(u16) == 2);
+static_assert(sizeof(u32) == 4);
+static_assert(sizeof(u64) == 8);
+static_assert(sizeof(size_t) == 8);


common.h hard-fails the build on 32-bit (BITS_PER_LONG < 64) but CONFIG_CIFS_COMPRESSION is not restricted in Kconfig. If 64-bit-only is intended, enforce it in Kconfig; otherwise remove this build-time error and keep the implementation portable (including removing/relaxing static_assert(sizeof(size_t) == 8)).

Suggested change

#if BITS_PER_LONG < 64 /* TODO */

# error "SMB3 compression is only supported on 64-bit architectures"

#endif /* BITS_PER_LONG < 64 */

/* Build time double checks (probably unnecessary) */

static_assert(sizeof(u16) == 2);

static_assert(sizeof(u32) == 4);

static_assert(sizeof(u64) == 8);

static_assert(sizeof(size_t) == 8);

/* Build time double checks (probably unnecessary) */

static_assert(sizeof(u16) == 2);

static_assert(sizeof(u32) == 4);

static_assert(sizeof(u64) == 8);

static_assert(sizeof(size_t) >= sizeof(u32));

Copilot · 2026-04-20T20:24:20Z

+#undef MEM_UNALIGNED_READ
+#undef MEM_UNALIGNED_WRITE
+
+/* Read prefetch */
+#define MEM_PREFETCH(ptr)		__builtin_prefetch((ptr), 0, 3)
+
+#ifdef CONFIG_X86
+# define MEM_UNALIGNED_READ(ptr, t)	(*(const t *)(ptr))
+# define MEM_UNALIGNED_WRITE(ptr, v, t)	(*(t *)(ptr) = (t)(v))
+#else
+# include <linux/unaligned.h>
+# define MEM_UNALIGNED_READ(ptr, t)	get_unaligned((const t *)(ptr))
+# define MEM_UNALIGNED_WRITE(ptr, v, t)	put_unaligned((v), (t *)(ptr))
+#endif /* !CONFIG_X86 */
+
+#define mem_read8(ptr)			MEM_UNALIGNED_READ(ptr, u8)
+#define mem_read16(ptr)			MEM_UNALIGNED_READ(ptr, u16)
+#define mem_read32(ptr)			MEM_UNALIGNED_READ(ptr, u32)
+#define mem_read64(ptr)			MEM_UNALIGNED_READ(ptr, u64)
+#define mem_write8(ptr, v)		MEM_UNALIGNED_WRITE(ptr, v, u8)
+#define mem_write16(ptr, v)		MEM_UNALIGNED_WRITE(ptr, v, u16)
+#define mem_write32(ptr, v)		MEM_UNALIGNED_WRITE(ptr, v, u32)


This header introduces very generic macros (MEM_PREFETCH, MEM_UNALIGNED_READ/WRITE) and also #undefs them, which can unexpectedly affect other headers/code included after this one. To match the stated naming convention and avoid macro namespace collisions, please rename these to a COMPRESS_ prefix (and avoid #undef in a public header unless absolutely required).

Suggested change

#undef MEM_UNALIGNED_READ

#undef MEM_UNALIGNED_WRITE

/* Read prefetch */

#define MEM_PREFETCH(ptr) __builtin_prefetch((ptr), 0, 3)

#ifdef CONFIG_X86

# define MEM_UNALIGNED_READ(ptr, t) (*(const t *)(ptr))

# define MEM_UNALIGNED_WRITE(ptr, v, t) (*(t *)(ptr) = (t)(v))

#else

# include <linux/unaligned.h>

# define MEM_UNALIGNED_READ(ptr, t) get_unaligned((const t *)(ptr))

# define MEM_UNALIGNED_WRITE(ptr, v, t) put_unaligned((v), (t *)(ptr))

#endif /* !CONFIG_X86 */

#define mem_read8(ptr) MEM_UNALIGNED_READ(ptr, u8)

#define mem_read16(ptr) MEM_UNALIGNED_READ(ptr, u16)

#define mem_read32(ptr) MEM_UNALIGNED_READ(ptr, u32)

#define mem_read64(ptr) MEM_UNALIGNED_READ(ptr, u64)

#define mem_write8(ptr, v) MEM_UNALIGNED_WRITE(ptr, v, u8)

#define mem_write16(ptr, v) MEM_UNALIGNED_WRITE(ptr, v, u16)

#define mem_write32(ptr, v) MEM_UNALIGNED_WRITE(ptr, v, u32)

/* Read prefetch */

#define COMPRESS_MEM_PREFETCH(ptr) __builtin_prefetch((ptr), 0, 3)

#ifdef CONFIG_X86

# define COMPRESS_MEM_UNALIGNED_READ(ptr, t) (*(const t *)(ptr))

# define COMPRESS_MEM_UNALIGNED_WRITE(ptr, v, t) (*(t *)(ptr) = (t)(v))

#else

# include <linux/unaligned.h>

# define COMPRESS_MEM_UNALIGNED_READ(ptr, t) get_unaligned((const t *)(ptr))

# define COMPRESS_MEM_UNALIGNED_WRITE(ptr, v, t) put_unaligned((v), (t *)(ptr))

#endif /* !CONFIG_X86 */

#define mem_read8(ptr) COMPRESS_MEM_UNALIGNED_READ(ptr, u8)

#define mem_read16(ptr) COMPRESS_MEM_UNALIGNED_READ(ptr, u16)

#define mem_read32(ptr) COMPRESS_MEM_UNALIGNED_READ(ptr, u32)

#define mem_read64(ptr) COMPRESS_MEM_UNALIGNED_READ(ptr, u64)

#define mem_write8(ptr, v) COMPRESS_MEM_UNALIGNED_WRITE(ptr, v, u8)

#define mem_write16(ptr, v) COMPRESS_MEM_UNALIGNED_WRITE(ptr, v, u16)

#define mem_write32(ptr, v) COMPRESS_MEM_UNALIGNED_WRITE(ptr, v, u32)

Copilot · 2026-04-20T20:24:20Z

+static __always_inline u32 lz77_compressed_alloc_size(const u32 size)
+{
+	return size + (size >> 3) + 8;
+}


lz77_compressed_alloc_size() does arithmetic in u32 and can overflow for sufficiently large inputs (e.g., near the 4GiB SMB2 write-length limit), which would under-allocate the destination buffer and could lead to memory corruption. Consider computing in size_t and using overflow-safe helpers (e.g., check_add_overflow()), and clamp/return an error when the required allocation doesn't fit.

Copilot · 2026-04-20T20:24:20Z

+	MEM_PREFETCH(srcp + COMPRESS_RSTEP_SIZE);
+
+	/*
+	 * Adjust @srcp so we don't get a false positive match on first iteration.
+	 * Then prepare hash for first loop iteration (don't advance @srcp again).
+	 */
+	hash = compress_hash_ptr(srcp++);
+	htable[hash] = 0;
+	hash = compress_hash_ptr(srcp);


lz77_compress() seeds the rolling hash with compress_hash_ptr(srcp) / compress_hash_ptr(srcp++), which unconditionally reads 4 bytes from src. If slen < COMPRESS_RSTEP_SIZE (or very small buffers in general), this will read past the end of the input. Please add a clear precondition check/early return for small slen (e.g., encode everything as literals or return -EMSGSIZE) before any mem_read32() hashing/compare operations.

ematsumiya added 8 commits April 20, 2026 15:12

smb: client: compress: add code docs to lz77.c

c070253

Document parts of the code, especially the apparently non-sense parts. Other: - change pointer increment constants to sizeof() values Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>

smb: common: add SMB3_COMPRESS_MAX_ALGS

cf0a13b

Set it to number of currently defined algorithms (6 as of now). Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>

smfrench requested review from Copilot and ematsumiya April 20, 2026 20:16

Copilot started reviewing on behalf of smfrench April 20, 2026 20:17 View session

Copilot AI reviewed Apr 20, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enzo patches part2#175

Enzo patches part2#175
smfrench wants to merge 8 commits into
code-reviewfrom
enzo-patches-part2

smfrench commented Apr 20, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 20, 2026

Uh oh!

Copilot AI Apr 20, 2026

Uh oh!

Copilot AI Apr 20, 2026

Uh oh!

Copilot AI Apr 20, 2026

Uh oh!

Copilot AI Apr 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

smfrench commented Apr 20, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants