Add CPU GDN kernel for Qwen3.5 with correctness test and benchmark by Misscheng-eng · Pull Request #687 · UbiquitousLearning/mllm

Misscheng-eng · 2026-06-21T14:23:12Z

Overview

This PR implements the CPU version of the Gated Delta Network (GDN) attention kernel used in Qwen3.5-0.8B.

The goal is to provide a minimal, high-performance CPU backend implementation for GDN, enabling correct forward inference and benchmarking within the mllm CPU execution framework.

Implementation

We implement a pure C++ CPU kernel:

mllm/backends/cpu/kernels/GDNKernel.hpp

The forward computation follows the Qwen3.5 GDN attention formulation:

S_t = α S_{t-1} - αβ (S_{t-1} k) k^T + β v k^T

Where:

α = exp(gate)
β is a learned scaling factor
k, v are key/value vectors
S is the state matrix
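
The update above can be read as three steps: decay the state by α, erase the state's current response along k, then write v along k. A minimal pure-Python sketch for a single head, assuming S is a D×D matrix with rows indexed by the value dimension and columns by the key dimension (the kernel's actual memory layout is not shown in this description, and `gdn_step` is a hypothetical name, not a function from the PR):

```python
import math

def gdn_step(s_prev, k, v, gate, beta):
    """One GDN state update for a single head.

    s_prev: D x D list of lists; k, v: length-D lists; gate, beta: floats.
    Returns a new D x D state following
    S_t = a*S_{t-1} - a*b*(S_{t-1} k) k^T + b*v k^T with a = exp(gate).
    """
    d = len(k)
    alpha = math.exp(gate)
    # proj = S_{t-1} k, the state's current response along the key direction
    proj = [sum(s_prev[i][j] * k[j] for j in range(d)) for i in range(d)]
    return [
        [
            alpha * s_prev[i][j]
            - alpha * beta * proj[i] * k[j]  # erase term
            + beta * v[i] * k[j]             # write term
            for j in range(d)
        ]
        for i in range(d)
    ]

# Tiny example: 2x2 identity state, unit key, gate = 0 (so alpha = 1).
s_new = gdn_step([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0], [0.0, 2.0], 0.0, 0.5)
print(s_new)
```

With gate = 0 nothing decays, so only the column selected by k changes: half of the old content along k is erased and half of v is written in its place.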

CPU Integration

We add a lightweight CPU operator:

GDNOp (forward dispatch wrapper)
Registered in CPU backend operator system

No modification is made to the core CPU runtime.

Testing

Correctness Test

Reference comparison against PyTorch implementation
Max error: ~6e-5
Batch sizes tested: 1, 4, 8

Benchmark Results (CPU)

Batch=1:

Latency: 0.207 ms
Throughput: 4822 tok/s

Batch=4:

Latency: 0.854 ms
Throughput: 4682 tok/s
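
The throughput figures appear to follow from batch size divided by per-call latency, which matches the `tokens_per_second = (B * repeats) / (end_time - start_time)` line in the benchmark script. A rough cross-check of the reported numbers, where the `tokens_per_second` helper below is illustrative and not the script's own code:

```python
def tokens_per_second(batch_size, latency_ms):
    """Tokens per second when one call processes batch_size tokens."""
    return batch_size / (latency_ms / 1000.0)

# (batch size, reported latency in ms, reported throughput in tok/s)
reported = [(1, 0.207, 4822), (4, 0.854, 4682)]

for batch, latency_ms, tok_s in reported:
    derived = tokens_per_second(batch, latency_ms)
    rel_diff = abs(derived - tok_s) / tok_s
    print(f"batch={batch}: derived {derived:.0f} tok/s, "
          f"reported {tok_s} tok/s, relative difference {rel_diff:.2%}")
```

Any small gap between the derived and reported values is likely due to the latency being rounded to three decimals in the description.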

Scope

This PR ONLY targets:

Qwen3.5-0.8B GDN attention kernel
CPU backend integration
correctness + performance validation

No changes to:

third-party libraries
CUDA backend
core CPU runtime architecture

Summary

This implementation provides a correct and efficient CPU baseline for Qwen3.5 GDN attention inference and can serve as a reference backend for further optimization (SIMD / multithreading).

coderabbitai · 2026-06-21T14:23:24Z

📝 Walkthrough


Adds a new CPU GDN (Gated Delta Network) forward kernel (GDNKernel) with compile-time dimensions D=128 and H=16, implementing exponential gating, state scaling, and rank-1 erase/write updates. Wires it into the op framework via CPUGDNOp and GDNOpFactory, and validates correctness and latency with a standalone C++ smoke test and a PyTorch JIT benchmark.

Changes

CPU GDN Kernel, Op Wiring, and Tests

| Layer / File(s) | Summary |
| --- | --- |
| GDNKernel forward implementation `mllm/backends/cpu/kernels/GDNKernel.hpp` | Defines `GDNKernel` struct with compile-time `D=128`, `H=16`, and templated `forward<T>` computing `alpha=exp(gate)`, scaling `s_prev`, applying erase and write rank-1 updates via `beta`, `k`, and `v`, and optionally computing `output = S_new * q`. |
| CPUGDNOp and GDNOpFactory wiring `mllm/backends/cpu/ops/GDNOp.hpp` | Adds `CPUGDNOp` (derives from `aops::GDNOp`) whose `forward()` extracts `float*` pointers from five input tensors and delegates to `GDNKernel::forward`, plus `GDNOpFactory` registering `OpTypes::kGDN`. |
| C++ smoke test and Python correctness/benchmark `mllm/backends/cpu/ops/test_gdn_kernel.cpp`, `tests/cpu/test_gdn_correctness.py` | `test_gdn_kernel.cpp` runs a small `H=2,D=4` kernel invocation and prints success. `test_gdn_correctness.py` JIT-compiles the kernel via PyBind11, compares against a pure-PyTorch reference with `torch.allclose`, and benchmarks latency/throughput for two batch sizes. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 Hippity-hop through matrices bright,
Gates and betas aligned just right.
s_prev decays, then writes anew,
Erase and write — a rank-one coup!
The kernel runs, the test prints "OK",
A GDN hop makes mllm's day! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding a CPU GDN kernel for Qwen3.5 with tests and benchmarks, which aligns with the changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description is structured, covers overview, implementation, testing, benchmarks, and scope, and is sufficiently detailed for review. |


coderabbitai

Actionable comments posted: 6

🧹 Nitpick comments (3)

mllm/backends/cpu/ops/test_gdn_kernel.cpp (1)

47-58: ⚡ Quick win

Smoke test never fails on wrong output.

Line 57 always prints success without validating s_new, so this binary won’t catch regressions.

Proposed fix

+#include <cmath>
 #include <iostream>
 #include <vector>
-#include <cmath>
@@
 int main() {
@@
     gdn_forward(s_prev, k, v, gate, beta, s_new);
 
-    std::cout << "GDN kernel OK\n";
+    // With zero s_prev and k/v initialized as {1,0,0,0} per head:
+    // s_new[h,0,0] should equal beta[h], and all other entries remain 0.
+    for (int h = 0; h < H; ++h) {
+        const int base = h * D * D;
+        if (std::fabs(s_new[base] - beta[h]) > 1e-6f) {
+            std::cerr << "Mismatch at head " << h << ": got " << s_new[base]
+                      << ", expected " << beta[h] << "\n";
+            return 1;
+        }
+    }
+    std::cout << "GDN kernel OK\n";
+    return 0;
 }

As per coding guidelines, “Suggest adding unit tests for untested complex logic or edge cases.”
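
The expected values in the proposed check follow from the update rule in the PR description: with S_{t-1} = 0 the decay and erase terms vanish, leaving S_t = β v k^T. A small sketch of that reasoning, assuming H=2 and D=4 as in the smoke test, k = v = {1,0,0,0}, and made-up gate and beta values (the test's real inputs are not shown here):

```python
import math

H, D = 2, 4
gate = [-0.1, -0.3]  # hypothetical per-head gates
beta = [0.5, 0.9]    # hypothetical per-head betas

def smoke_state(h):
    """State for head h after one update from an all-zero s_prev."""
    alpha = math.exp(gate[h])
    s_prev = [[0.0] * D for _ in range(D)]
    k = [1.0, 0.0, 0.0, 0.0]
    v = [1.0, 0.0, 0.0, 0.0]
    # proj = s_prev @ k, which is all zeros here
    proj = [sum(s_prev[i][j] * k[j] for j in range(D)) for i in range(D)]
    return [
        [alpha * s_prev[i][j]
         - alpha * beta[h] * proj[i] * k[j]
         + beta[h] * v[i] * k[j]
         for j in range(D)]
        for i in range(D)
    ]

for h in range(H):
    s_new = smoke_state(h)
    for i in range(D):
        for j in range(D):
            # Only entry (0, 0) is written, and it equals beta[h].
            expected = beta[h] if (i, j) == (0, 0) else 0.0
            assert abs(s_new[i][j] - expected) <= 1e-6, (h, i, j)
print("all heads match beta * v k^T")
```

Because the gate only scales terms that are zero in this setup, the check is independent of the gate values, which is why the proposed C++ assertion compares against `beta[h]` alone.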

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mllm/backends/cpu/ops/test_gdn_kernel.cpp` around lines 47 - 58, The main
function in the test calls gdn_forward with test inputs but never validates the
output array s_new, so the test always succeeds regardless of correctness. Add
validation checks after the gdn_forward call that compare the computed values in
s_new against expected results using assertions or similar checks. This will
ensure regressions in the gdn_forward function are caught when the output
differs from expected values.

Source: Coding guidelines

mllm/backends/cpu/kernels/GDNKernel.hpp (2)

10-24: ⚡ Quick win

Document the raw-pointer layout contract.

The formula comment does not state required shapes/layouts, required non-null pointers, batch_size constraints, optional q/output semantics, or failure behavior for this public raw-pointer API.

As per coding guidelines, “Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mllm/backends/cpu/kernels/GDNKernel.hpp` around lines 10 - 24, The forward
method in the GDNKernel struct lacks documentation of its raw-pointer API
contract. Add a comprehensive docstring to the forward method that specifies the
required shapes and memory layouts of all pointer parameters (s_prev, k, v,
gate, beta, s_new, output, q), which pointers must be non-null and which are
optional, the valid range for batch_size, the semantics of the optional output
and q parameters, and any failure modes or error behavior. This will ensure
users understand the memory requirements and constraints for calling this public
API correctly.
Source: Coding guidelines

5-6: ⚡ Quick win

Avoid heap allocation inside the per-head kernel loop.

std::vector<T> proj(D, 0) allocates for every batch/head. Since D is constexpr, use stack storage and avoid allocator overhead in this CPU hot path.
Proposed fix
-#include <vector>
+#include <array>
 #include <cmath>
@@
-                std::vector<T> proj(D, 0);
+                std::array<T, D> proj{};  // value-initialized to zero, matching the vector it replaces
As per coding guidelines, “Avoid unnecessary object creation in loops or hot paths.”

Also applies to: 47-53
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mllm/backends/cpu/kernels/GDNKernel.hpp` around lines 5 - 6, The
GDNKernel.hpp file allocates a std::vector in the per-head kernel loop (lines
47-53 and similar locations), which causes heap allocation overhead in a
CPU-intensive hot path. Since D is constexpr, replace the std::vector<T> proj(D,
0) allocations with a stack-allocated std::array<T, D> instead. This eliminates
the allocator overhead while maintaining the same functionality, as the size is
known at compile time. Ensure all instances of this pattern throughout the
kernel loop are updated to use the stack-allocated alternative.
Source: Coding guidelines

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@mllm/backends/cpu/kernels/GDNKernel.hpp`:
- Around line 20-23: Remove all trailing whitespace from the end of the lines in
the GDNKernel.hpp file. Specifically, trim any extra spaces at the end of the
parameter declaration lines in the function signature (around the s_prev, k, v,
gate, beta, s_new, output, and q parameters), and also check lines 36-38 and
line 69 for the same issue. Ensure each line ends cleanly without any trailing
spaces, as per the coding guidelines.
- Around line 18-24: The forward method in the GDNKernel template class has a
template parameter mismatch where the output parameter is declared as float* and
the q parameter is declared as const float*, but they should be T* and const T*
respectively to match the template parameter T used throughout the function.
Change the output parameter type from float* to T* and the q parameter type from
const float* to const T* in the forward method signature to ensure type
consistency and avoid undefined behavior when the template is instantiated with
types other than float.

In `@mllm/backends/cpu/ops/GDNOp.hpp`:
- Line 24: Line 24 contains trailing whitespace on a blank line, which violates
coding guidelines that prohibit lines from ending with whitespace. Navigate to
line 24 in the GDNOp.hpp file and remove all trailing whitespace (spaces or
tabs) from that blank line, leaving it completely empty.
- Around line 17-35: The forward() method in the GDNOp class directly accesses
raw float pointers from input and output tensors without validating their
correctness. Before calling GDNKernel::forward(), add validation checks to
ensure: the inputs vector contains exactly 5 tensors and outputs vector contains
exactly 1 tensor, all tensors have float dtype, all tensors are contiguous or
layout-compatible, the batch size B extracted from s_prev matches across all
input/output tensors, and each tensor has the expected shape (specifically
[B,16,128,*] or similar dimensions as per the kernel requirements). Perform
these checks after extracting B but before accessing any raw pointers via
.ptr<float>() calls.

In `@tests/cpu/test_gdn_correctness.py`:
- Line 9: The KERNEL_HEADER_PATH variable on line 9 uses a hardcoded absolute
path that is machine-specific and breaks on CI and other developer machines.
Replace the hardcoded path with a repo-relative path discovery approach by using
the test file's location (__file__) to dynamically construct the path to the
GDNKernel.hpp header. Use path manipulation functions like os.path.dirname,
os.path.join, or pathlib.Path to navigate from the current test file to the
kernel header location relative to the repository root. Apply the same fix to
line 17 which also has a hardcoded absolute path.
- Line 105: The print statement on line 105 uses an f-string prefix but contains
no placeholder variables, which triggers the Ruff F541 lint error. Remove the f
prefix from the string in the print function call since the string is a literal
string with no variables that need interpolation.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: aafc0fc2-4319-4c0e-9e1c-9f3b77f55371

📥 Commits

Reviewing files that changed from the base of the PR and between b9c3070 and 8c6425f.

📒 Files selected for processing (5)

mllm/backends/cpu/kernels/GDNKernel.hpp
mllm/backends/cpu/ops/GDNOp.cpp
mllm/backends/cpu/ops/GDNOp.hpp
mllm/backends/cpu/ops/test_gdn_kernel.cpp
tests/cpu/test_gdn_correctness.py

coderabbitai · 2026-06-21T14:28:05Z

+    template <typename T>
+    static void forward(
+        const T* s_prev, const T* k, const T* v, 
+        const T* gate, const T* beta,
+        T* s_new, 
+        float* output = nullptr, const float* q = nullptr, // 👈 changed to float*
+        int batch_size = 1


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: List GDNKernel::forward call sites and explicit template arguments.
# Expectation: Existing call sites should all use float if this kernel is intended to remain float-only.
rg -nP -C3 'GDNKernel::forward\s*(<[^>]+>)?\s*\('

Repository: UbiquitousLearning/mllm

Length of output: 1197

🏁 Script executed:

cat -n mllm/backends/cpu/kernels/GDNKernel.hpp | sed -n '15,80p'

Repository: UbiquitousLearning/mllm

Length of output: 3141

🏁 Script executed:

rg -n 'GDNKernel' --type hpp --type cpp --type py | grep -E '(forward|::)' | head -20

Repository: UbiquitousLearning/mllm

Length of output: 95

🏁 Script executed:

rg -n 'GDNKernel' --type-add 'hpp:*.{hpp,h}' --type hpp --type cpp --type py

Repository: UbiquitousLearning/mllm

Length of output: 637

Fix template parameter mismatch in GDNKernel::forward.

The forward<T> function signature declares output and q as float* and const float* (line 23), but uses them as T* and const T* at lines 72–73. This creates undefined behavior if instantiated with T other than float. All current call sites (test_gdn_correctness.py:22 and GDNOp.hpp:30) use float exclusively.

Choose one approach:

Remove the template and declare the kernel as float-only, or

Change output and q parameters to T* and const T* to match the template contract.

Option 2 fix

-        float* output = nullptr, const float* q = nullptr,
+        T* output = nullptr, const T* q = nullptr,


coderabbitai · 2026-06-21T14:28:05Z

+        const T* s_prev, const T* k, const T* v, 
+        const T* gate, const T* beta,
+        T* s_new, 
+        float* output = nullptr, const float* q = nullptr, // 👈 changed to float*


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove trailing whitespace from the new header.

These lines visibly end with extra spaces in the submitted content.

As per coding guidelines, “No line may end with trailing whitespace.”

Also applies to: 36-38, 69-69


Source: Coding guidelines

coderabbitai · 2026-06-21T14:28:05Z

+  void forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override {
+    // Get the input and output tensors
+    const auto& s_prev = inputs[0];
+    const auto& k = inputs[1];
+    const auto& v = inputs[2];
+    const auto& gate = inputs[3];
+    const auto& beta = inputs[4];
+
+    auto& s_new = outputs[0];
+    int B = s_prev.size(0); // Get the batch size
+
+    // Call the pure-compute GDNKernel logic
+    // Note: .ptr<float>() is used here to get the underlying data pointer
+    GDNKernel::forward(
+        s_prev.ptr<float>(), k.ptr<float>(), v.ptr<float>(),
+        gate.ptr<float>(), beta.ptr<float>(),
+        s_new.ptr<float>(),
+        nullptr, nullptr, // output and q are unused for now; pass null pointers
+        B


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate tensor arity, shape, dtype, and layout before raw pointer access.

forward() indexes inputs[0..4]/outputs[0] and then calls a fixed-layout kernel that reads/writes [B,16,128,*] memory using raw float pointers. Add project-standard checks for 5 inputs, 1 output, float dtype, contiguous/layout-compatible tensors, matching B, and expected shapes before calling the kernel.

As per coding guidelines, “Validate inputs for public APIs and critical internal functions” and “Ensure functions that can fail return appropriate error codes or raise exceptions.”
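
The order of checks matters here: arity has to be verified before `inputs[0]` is indexed, and the batch size can only be compared once `s_prev` is known to have the right rank. A language-neutral sketch of that ordering in Python, operating on shape tuples only. The layout below (`s_prev`/`s_new` as [B, H, D, D], `k`/`v` as [B, H, D], `gate`/`beta` as [B, H]) is an assumption, since the PR does not state it, and `validate_gdn_shapes` is a hypothetical helper; real checks would be written in C++ against mllm's own tensor API and would also cover dtype and contiguity:

```python
H, D = 16, 128  # compile-time kernel dimensions reported in the walkthrough

def validate_gdn_shapes(input_shapes, output_shapes):
    """Return the batch size, or raise ValueError on any mismatch.

    Assumed layout (not confirmed by the PR): s_prev and s_new are
    [B, H, D, D], k and v are [B, H, D], gate and beta are [B, H].
    """
    # Arity first: indexing inputs[0..4] is only safe after this check.
    if len(input_shapes) != 5 or len(output_shapes) != 1:
        raise ValueError("expected 5 inputs and 1 output")
    s_prev = tuple(input_shapes[0])
    if len(s_prev) != 4 or s_prev[0] < 1:
        raise ValueError(f"s_prev must be [B, {H}, {D}, {D}] with B >= 1, got {s_prev}")
    batch = s_prev[0]
    expected = [
        ("s_prev", (batch, H, D, D)),
        ("k", (batch, H, D)),
        ("v", (batch, H, D)),
        ("gate", (batch, H)),
        ("beta", (batch, H)),
    ]
    for (name, want), got in zip(expected, input_shapes):
        if tuple(got) != want:
            raise ValueError(f"{name}: expected {want}, got {tuple(got)}")
    if tuple(output_shapes[0]) != (batch, H, D, D):
        raise ValueError("s_new must match the shape of s_prev")
    return batch

good = [(2, H, D, D), (2, H, D), (2, H, D), (2, H), (2, H)]
print(validate_gdn_shapes(good, [(2, H, D, D)]))
```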


Source: Coding guidelines

coderabbitai · 2026-06-21T14:28:05Z

+    const auto& v = inputs[2];
+    const auto& gate = inputs[3];
+    const auto& beta = inputs[4];
+


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove trailing whitespace on the blank line.

Line 24 visibly contains indentation-only trailing whitespace.

As per coding guidelines, “No line may end with trailing whitespace.”


Source: Coding guidelines

coderabbitai · 2026-06-21T14:28:05Z

+# ==========================================
+# ⚠️ Be sure to replace this with the real absolute path to GDNKernel.hpp on your machine!
+# ==========================================
+KERNEL_HEADER_PATH = "/home/cjt/mllm/mllm-main/mllm/backends/cpu/kernels/GDNKernel.hpp"


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use repo-relative header discovery instead of a machine-local absolute path.

Line 9 hardcodes /home/..., which breaks on CI and every other developer machine.

Proposed fix

 import torch
 import time
 from torch.utils.cpp_extension import load_inline
-import os
+from pathlib import Path
@@
-KERNEL_HEADER_PATH = "/home/cjt/mllm/mllm-main/mllm/backends/cpu/kernels/GDNKernel.hpp"
+KERNEL_HEADER_PATH = (
+    Path(__file__).resolve().parents[2]
+    / "mllm"
+    / "backends"
+    / "cpu"
+    / "kernels"
+    / "GDNKernel.hpp"
+)
+if not KERNEL_HEADER_PATH.exists():
+    raise FileNotFoundError(f"GDN header not found: {KERNEL_HEADER_PATH}")
@@
-#include "{KERNEL_HEADER_PATH}"
+#include "{KERNEL_HEADER_PATH.as_posix()}"

Also applies to: 17-17
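
The `parents[2]` index in the proposed fix depends on the test living two directories below the repository root (`tests/cpu/`). A short sketch of how the index resolves, using `PurePosixPath` so it runs without touching the filesystem; the `/work/mllm` checkout location is hypothetical:

```python
from pathlib import PurePosixPath

# Hypothetical checkout location; only the relative structure matters.
test_file = PurePosixPath("/work/mllm/tests/cpu/test_gdn_correctness.py")

# parents[0] = .../tests/cpu, parents[1] = .../tests, parents[2] = repository root
repo_root = test_file.parents[2]
header = repo_root / "mllm" / "backends" / "cpu" / "kernels" / "GDNKernel.hpp"

print(repo_root)
print(header)
```

If the test file is ever moved to a different depth, the index would need to change with it, which is one reason the fix also checks that the header exists before using it.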


coderabbitai · 2026-06-21T14:28:05Z

+    # Actual throughput depends on how your workload defines it; here the batch size is used as the reference number of tokens processed per second
+    tokens_per_second = (B * repeats) / (end_time - start_time)
+
+    print(f"\n📈 Benchmark results summary:")


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove redundant f prefix (Ruff F541).

This line is an f-string without placeholders and triggers lint error.

Proposed fix

-    print(f"\n📈 Benchmark results summary:")
+    print("\n📈 Benchmark results summary:")


🧰 Tools

🪛 Ruff (0.15.17)

[error] 105-105: f-string without any placeholders

Remove extraneous f prefix

(F541)


Source: Linters/SAST tools

Add CPU GDN kernel with correctness test and benchmark

8c6425f

Misscheng-eng requested review from chenghuaWang, oreomaker and yirongjie as code owners June 21, 2026 14:23

coderabbitai Bot reviewed Jun 21, 2026

Misscheng-eng wants to merge 1 commit into UbiquitousLearning:main from Misscheng-eng:gdn-pr-clean
