Skip to content

Enzo patches part2#175

Open
smfrench wants to merge 8 commits into
code-reviewfrom
enzo-patches-part2
Open

Enzo patches part2#175
smfrench wants to merge 8 commits into
code-reviewfrom
enzo-patches-part2

Conversation

@smfrench

Copy link
Copy Markdown
Owner

8 compression patches from Enzo

@dst buffer is allocated with same size as @src, which, for good
compression cases, works fine.

However, when compression goes bad (e.g. random bytes payloads), the
compressed size can increase significantly, and even by stopping the
main loop at 7/8 of @slen, writing leftover literals could write past
the end of @dst because of LZ77 metadata.

To fix this, add lz77_compressed_alloc_size() helper to compute the
correct allocation size for @dst, accounting for metadata and worst
cast scenario (all literals).

While this is overprovisioning memory, it's not only correct, but also
allows lz77_compress() main loop to run without ever checking @dst
limits (i.e. a perf improvement).

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
- lz77_match_len() increments @Cur before checking for equality,
  leading to off-by-one match len in some cases.

  Fix by moving pointers increment to inside the loop.
  Also rename @wnd arg to @match (more accurate name).
- both lz77_match_len() and lz77_compress() checked for
  "buf + step < end" when the correct is "<=" for such cases.

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
Increase max distance (i.e. window size) from 1k to 8k.
This allows better compression and is just as fast.

Other:
- drop LZ77_MATCH_MIN_DIST as it's nused -- main loop
  already checks if dist > 0

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
End-of-stream flag could lead to UB because of int promotion
(overwriting signed bit).

Fix it by changing operand from '1' to '1UL'.

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
This patch implements several micro-optimizations on lz77_compress()
with the goal of reducing the number of instructions per [input]
byte (a.k.a. IPB).

Changes:
- change hashtable to be u32 (instead of u64) -- change the hash
  function to reflect that (adds lz77_hash() and lz77_read32() helpers)
- batch-write literals instead of 1 by 1 -- now that we have a well
  defined hot path (match finding) and a cold path (encode literals +
  match), batch writing makes a significant difference
- implement adaptive skipping of input bytes -- skip input bytes more
  aggressively if too few matches are being found
- name some constants for more meaningful context

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
Document parts of the code, especially the apparently
non-sense parts.

Other:
- change pointer increment constants to sizeof() values

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
Add compress/common.h to aggregate helpers and definitions that will be
shared with other compression algorithms.

Also add a few build time checks for proper support.

Changes:
- update affected call paths in compress/lz77.c

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
Set it to number of currently defined algorithms (6 as of now).

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR continues the SMB3 message-compression work by refactoring the client-side LZ77 compressor for performance, introducing shared compression helpers, and adjusting buffer sizing to support worst-case output.

Changes:

  • Add a shared compress/common.h with hashing and unaligned memory access helpers used by compression algorithms.
  • Refactor compress/lz77.c to use the new common helpers and a new literals/match encoding flow (including adaptive skipping).
  • Add lz77_compressed_alloc_size() and switch smb_compress() to allocate the worst-case output buffer size up front.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
fs/smb/common/smb2pdu.h Adds SMB3_COMPRESS_MAX_ALGS constant for compression algorithm indexing.
fs/smb/client/compress/lz77.h Adds lz77_compressed_alloc_size() helper and updates function signature constness.
fs/smb/client/compress/lz77.c Major LZ77 compressor refactor using shared helpers and new encoding/match-finding structure.
fs/smb/client/compress/common.h New header providing shared hashing, match-length, and unaligned access helpers.
fs/smb/client/compress.c Uses lz77_compressed_alloc_size() to allocate sufficient destination buffer for LZ77.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +29 to +31
#ifndef __LITTLE_ENDIAN /* TODO */
# error "SMB3 compression is only supported on little endian architectures"
#endif /* !__LITTLE_ENDIAN */

Copilot AI Apr 20, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

common.h hard-fails the build on big-endian via #error, but CONFIG_CIFS_COMPRESSION has no Kconfig dependency that restricts endianness. This will break CIFS_COMPRESSION builds on big-endian architectures; either add an appropriate Kconfig depends on !CPU_BIG_ENDIAN (or equivalent) or make the compression code endian-safe (e.g., use explicit little-endian unaligned accesses for the on-wire format) and remove the build-time error.

Copilot uses AI. Check for mistakes.
Comment on lines +33 to +41
#if BITS_PER_LONG < 64 /* TODO */
# error "SMB3 compression is only supported on 64-bit architectures"
#endif /* BITS_PER_LONG < 64 */

/* Build time double checks (probably unnecessary) */
static_assert(sizeof(u16) == 2);
static_assert(sizeof(u32) == 4);
static_assert(sizeof(u64) == 8);
static_assert(sizeof(size_t) == 8);

Copilot AI Apr 20, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

common.h hard-fails the build on 32-bit (BITS_PER_LONG < 64) but CONFIG_CIFS_COMPRESSION is not restricted in Kconfig. If 64-bit-only is intended, enforce it in Kconfig; otherwise remove this build-time error and keep the implementation portable (including removing/relaxing static_assert(sizeof(size_t) == 8)).

Suggested change
#if BITS_PER_LONG < 64 /* TODO */
# error "SMB3 compression is only supported on 64-bit architectures"
#endif /* BITS_PER_LONG < 64 */
/* Build time double checks (probably unnecessary) */
static_assert(sizeof(u16) == 2);
static_assert(sizeof(u32) == 4);
static_assert(sizeof(u64) == 8);
static_assert(sizeof(size_t) == 8);
/* Build time double checks (probably unnecessary) */
static_assert(sizeof(u16) == 2);
static_assert(sizeof(u32) == 4);
static_assert(sizeof(u64) == 8);
static_assert(sizeof(size_t) >= sizeof(u32));

Copilot uses AI. Check for mistakes.
Comment on lines +66 to +87
#undef MEM_UNALIGNED_READ
#undef MEM_UNALIGNED_WRITE

/* Read prefetch */
#define MEM_PREFETCH(ptr) __builtin_prefetch((ptr), 0, 3)

#ifdef CONFIG_X86
# define MEM_UNALIGNED_READ(ptr, t) (*(const t *)(ptr))
# define MEM_UNALIGNED_WRITE(ptr, v, t) (*(t *)(ptr) = (t)(v))
#else
# include <linux/unaligned.h>
# define MEM_UNALIGNED_READ(ptr, t) get_unaligned((const t *)(ptr))
# define MEM_UNALIGNED_WRITE(ptr, v, t) put_unaligned((v), (t *)(ptr))
#endif /* !CONFIG_X86 */

#define mem_read8(ptr) MEM_UNALIGNED_READ(ptr, u8)
#define mem_read16(ptr) MEM_UNALIGNED_READ(ptr, u16)
#define mem_read32(ptr) MEM_UNALIGNED_READ(ptr, u32)
#define mem_read64(ptr) MEM_UNALIGNED_READ(ptr, u64)
#define mem_write8(ptr, v) MEM_UNALIGNED_WRITE(ptr, v, u8)
#define mem_write16(ptr, v) MEM_UNALIGNED_WRITE(ptr, v, u16)
#define mem_write32(ptr, v) MEM_UNALIGNED_WRITE(ptr, v, u32)

Copilot AI Apr 20, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This header introduces very generic macros (MEM_PREFETCH, MEM_UNALIGNED_READ/WRITE) and also #undefs them, which can unexpectedly affect other headers/code included after this one. To match the stated naming convention and avoid macro namespace collisions, please rename these to a COMPRESS_ prefix (and avoid #undef in a public header unless absolutely required).

Suggested change
#undef MEM_UNALIGNED_READ
#undef MEM_UNALIGNED_WRITE
/* Read prefetch */
#define MEM_PREFETCH(ptr) __builtin_prefetch((ptr), 0, 3)
#ifdef CONFIG_X86
# define MEM_UNALIGNED_READ(ptr, t) (*(const t *)(ptr))
# define MEM_UNALIGNED_WRITE(ptr, v, t) (*(t *)(ptr) = (t)(v))
#else
# include <linux/unaligned.h>
# define MEM_UNALIGNED_READ(ptr, t) get_unaligned((const t *)(ptr))
# define MEM_UNALIGNED_WRITE(ptr, v, t) put_unaligned((v), (t *)(ptr))
#endif /* !CONFIG_X86 */
#define mem_read8(ptr) MEM_UNALIGNED_READ(ptr, u8)
#define mem_read16(ptr) MEM_UNALIGNED_READ(ptr, u16)
#define mem_read32(ptr) MEM_UNALIGNED_READ(ptr, u32)
#define mem_read64(ptr) MEM_UNALIGNED_READ(ptr, u64)
#define mem_write8(ptr, v) MEM_UNALIGNED_WRITE(ptr, v, u8)
#define mem_write16(ptr, v) MEM_UNALIGNED_WRITE(ptr, v, u16)
#define mem_write32(ptr, v) MEM_UNALIGNED_WRITE(ptr, v, u32)
/* Read prefetch */
#define COMPRESS_MEM_PREFETCH(ptr) __builtin_prefetch((ptr), 0, 3)
#ifdef CONFIG_X86
# define COMPRESS_MEM_UNALIGNED_READ(ptr, t) (*(const t *)(ptr))
# define COMPRESS_MEM_UNALIGNED_WRITE(ptr, v, t) (*(t *)(ptr) = (t)(v))
#else
# include <linux/unaligned.h>
# define COMPRESS_MEM_UNALIGNED_READ(ptr, t) get_unaligned((const t *)(ptr))
# define COMPRESS_MEM_UNALIGNED_WRITE(ptr, v, t) put_unaligned((v), (t *)(ptr))
#endif /* !CONFIG_X86 */
#define mem_read8(ptr) COMPRESS_MEM_UNALIGNED_READ(ptr, u8)
#define mem_read16(ptr) COMPRESS_MEM_UNALIGNED_READ(ptr, u16)
#define mem_read32(ptr) COMPRESS_MEM_UNALIGNED_READ(ptr, u32)
#define mem_read64(ptr) COMPRESS_MEM_UNALIGNED_READ(ptr, u64)
#define mem_write8(ptr, v) COMPRESS_MEM_UNALIGNED_WRITE(ptr, v, u8)
#define mem_write16(ptr, v) COMPRESS_MEM_UNALIGNED_WRITE(ptr, v, u16)
#define mem_write32(ptr, v) COMPRESS_MEM_UNALIGNED_WRITE(ptr, v, u32)

Copilot uses AI. Check for mistakes.
Comment on lines +37 to +40
static __always_inline u32 lz77_compressed_alloc_size(const u32 size)
{
return size + (size >> 3) + 8;
}

Copilot AI Apr 20, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lz77_compressed_alloc_size() does arithmetic in u32 and can overflow for sufficiently large inputs (e.g., near the 4GiB SMB2 write-length limit), which would under-allocate the destination buffer and could lead to memory corruption. Consider computing in size_t and using overflow-safe helpers (e.g., check_add_overflow()), and clamp/return an error when the required allocation doesn't fit.

Copilot uses AI. Check for mistakes.
Comment on lines +165 to +173
MEM_PREFETCH(srcp + COMPRESS_RSTEP_SIZE);

/*
* Adjust @srcp so we don't get a false positive match on first iteration.
* Then prepare hash for first loop iteration (don't advance @srcp again).
*/
hash = compress_hash_ptr(srcp++);
htable[hash] = 0;
hash = compress_hash_ptr(srcp);

Copilot AI Apr 20, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lz77_compress() seeds the rolling hash with compress_hash_ptr(srcp) / compress_hash_ptr(srcp++), which unconditionally reads 4 bytes from src. If slen < COMPRESS_RSTEP_SIZE (or very small buffers in general), this will read past the end of the input. Please add a clear precondition check/early return for small slen (e.g., encode everything as literals or return -EMSGSIZE) before any mem_read32() hashing/compare operations.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants