Add FFT testbench #59
Merged
Pull request overview
Adds a new FFT testbench harness and a Tenstorrent-inspired dataflow case-study demo, while extending Zeonica’s emulator/driver stack with a new TT_MATMUL_TILE_U32 opcode and host-side per-core data token injection/collection APIs.
Changes:
- Introduce a 256-point FFT testbench (runtime config, harness, and Go test) that validates shared-memory behavior against a source-level integer FFT model.
- Add a TT-Metalium-inspired multicore matmul + eltwise-add experimental case study (docs, kernels, demo harness, optional trace-summary output).
- Extend core/emulator + driver + tile interfaces to support `TT_MATMUL_TILE_U32` and host drain/inject of `cgra.Data` tokens.
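The first change compares shared-memory contents against a CPU-side model. A minimal sketch of that kind of check follows. The helper name `firstMismatch` and the sample values are hypothetical and are not taken from `test/testbench/fft/main.go`.

```go
package main

import "fmt"

// firstMismatch returns the index of the first element where got and want
// differ, or -1 when the slices are identical. If one slice is a strict
// prefix of the other, the length of the shorter slice is returned.
// Hypothetical helper for illustration only.
func firstMismatch(got, want []uint32) int {
	n := len(got)
	if len(want) < n {
		n = len(want)
	}
	for i := 0; i < n; i++ {
		if got[i] != want[i] {
			return i
		}
	}
	if len(got) != len(want) {
		return n
	}
	return -1
}

func main() {
	want := []uint32{4, 0, 0, 0} // assumed golden-model output
	got := []uint32{4, 0, 1, 0}  // assumed shared-memory dump
	if i := firstMismatch(got, want); i >= 0 {
		fmt.Printf("mismatch at index %d: got %d, want %d\n", i, got[i], want[i])
	}
}
```

Reporting the first differing index, rather than a bare pass/fail, tends to make it easier to map a failure back to a specific memory word.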
Reviewed changes
Copilot reviewed 34 out of 35 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| test/testbench/fft/main.go | FFT harness: loads YAML program, preloads shared memory, runs, and checks results vs CPU model. |
| test/testbench/fft/fft_test.go | Go test wrapper around the FFT harness. |
| test/testbench/fft/arch_spec.yaml | Arch spec for FFT testbench (shared-memory mode + trace logging). |
| experiments/tenstorrent_dataflow_case_study/tt_metal_sources/README.md | Doc placeholder for TT-Metal source notes policy. |
| experiments/tenstorrent_dataflow_case_study/tt_metal_sources/matmul_multicore_source_note.md | Extracted dataflow pattern notes for TT multicore matmul (no vendored source). |
| experiments/tenstorrent_dataflow_case_study/TODO.md | Case-study TODO/status tracking. |
| experiments/tenstorrent_dataflow_case_study/results/summary.md | Summary of the case study’s supported/unsupported claims + current result. |
| experiments/tenstorrent_dataflow_case_study/results/README.md | Results directory placeholder. |
| experiments/tenstorrent_dataflow_case_study/results/matmul_multicore_trace_summary.md | Example/record of lightweight lifecycle trace summary output. |
| experiments/tenstorrent_dataflow_case_study/README.md | Case study overview, scope, and reproduction command. |
| experiments/tenstorrent_dataflow_case_study/matmul_multicore_demo/main.go | Multicore matmul demo using TT_MATMUL_TILE_U32 and per-core token feed/collect. |
| experiments/tenstorrent_dataflow_case_study/lowering_notes/README.md | Lowering-notes directory placeholder. |
| experiments/tenstorrent_dataflow_case_study/lowering_notes/matmul_multicore_dataflow_mapping.md | Manual lowering boundary + mapping rules for the matmul demo. |
| experiments/tenstorrent_dataflow_case_study/kernels/matmul_2x2/README.md | Kernel artifact documentation for the full-size multicore matmul demo. |
| experiments/tenstorrent_dataflow_case_study/kernels/matmul_2x2/matmul_multicore.yaml | 4x4 per-PE YAML program invoking TT_MATMUL_TILE_U32. |
| experiments/tenstorrent_dataflow_case_study/kernels/eltwise_add/README.md | Documentation for minimal eltwise-add producer/consumer demo. |
| experiments/tenstorrent_dataflow_case_study/kernels/eltwise_add/eltwise_add.yaml | 1x1 YAML program for ADD West/North -> East. |
| experiments/tenstorrent_dataflow_case_study/figures/README.md | Figures directory placeholder + policy. |
| experiments/tenstorrent_dataflow_case_study/figures/matmul_multicore_visuals.md | Mermaid figure sources + visual tables for the case study. |
| experiments/tenstorrent_dataflow_case_study/eltwise_add_demo/main.go | Minimal demo harness for eltwise-add kernel. |
| experiments/tenstorrent_dataflow_case_study/DESIGN.md | Design + fidelity boundary + evidence chain for the case study. |
| experiments/tenstorrent_dataflow_case_study/configs/README.md | Configs directory placeholder. |
| experiments/tenstorrent_dataflow_case_study/CLAIM_BOUNDARY.md | Explicit claim boundary (what’s supported vs what must not be claimed). |
| experiments/tenstorrent_dataflow_case_study/CASE_STUDY_TEXT.md | Draft paper-ready case-study narrative. |
| core/tt_matmul_tile_test.go | Unit tests for ttMatmulTileU32 (single ktile, 2 ktiles, kt=20 shape). |
| core/opcodes.go | Register TT_MATMUL_TILE_U32 as a vector opcode (plus formatting updates). |
| core/emu.go | Implement TT_MATMUL_TILE_U32 and add host-drain tracking; modifies PHI_START behavior. |
| core/core.go | Add host inject/drain methods and skip network send on host-drained directions. |
| core/builder.go | Initialize HostDrainDirections in core state. |
| config/platform.go | Plumb new tile APIs through platform tile wrapper. |
| cgra/cgra.go | Extend cgra.Tile interface with inject/drain/host-drain controls. |
| api/mock_cgra_test.go | Update gomock tile mock for new tile methods. |
| api/driver.go | Add per-core cgra.Data feed/collect tasks and APIs. |
| api/data_token_test.go | Test that per-core feed/collect carries a vector token end-to-end. |
Files not reviewed (1)
- api/mock_cgra_test.go: Language not supported
Comments suppressed due to low confidence (1)
core/emu.go:3493
`runPhiStart` currently allows both source predicates to be true, in which case it will silently prefer `src1` and ignore `src2`. This breaks the mutual-exclusion invariant enforced by `runPhi` and can hide control-flow bugs by producing non-deterministic/incorrect PHI selection instead of failing fast.
src2Struct := i.readOperand(src2, state) // src2 is consumed only on the normal path
src2Val := src2Struct.First()
src2Pred := src2Struct.Pred
if src1Pred {
	result = src1Val
	finalPred = src1Pred
} else { // src2Pred is true, or both predicates are false (arbitrary choice)
	result = src2Val
	finalPred = src2Pred
}
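One way to restore the fail-fast behaviour described in this comment is to reject the case where both predicates are true before selecting a value. The sketch below is an assumption about what such a check could look like. `selectPhi` and `errBothPredicates` are hypothetical names, not identifiers from `core/emu.go`, and operands are simplified to a `uint32` plus a predicate.

```go
package main

import (
	"errors"
	"fmt"
)

// errBothPredicates signals a violated mutual-exclusion invariant.
// Hypothetical name, for illustration only.
var errBothPredicates = errors.New("phi: both source predicates are true")

// selectPhi picks the PHI result from two predicated sources.
// It returns an error instead of silently preferring src1 when both
// predicates are set.
func selectPhi(src1Val uint32, src1Pred bool, src2Val uint32, src2Pred bool) (uint32, bool, error) {
	switch {
	case src1Pred && src2Pred:
		return 0, false, errBothPredicates
	case src1Pred:
		return src1Val, true, nil
	default:
		// Either src2Pred is true, or both are false; forward src2
		// together with its predicate, as the quoted snippet does.
		return src2Val, src2Pred, nil
	}
}

func main() {
	if _, _, err := selectPhi(1, true, 2, true); err != nil {
		fmt.Println("rejected:", err)
	}
	v, p, _ := selectPhi(1, false, 2, true)
	fmt.Println(v, p)
}
```

Whether the emulator should return an error or panic here is a design choice for the maintainers; the sketch only shows the predicate check itself.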
5820166 to f96ed0e
Jackcuii approved these changes on Jun 10, 2026
## Summary
- Add an FFT testbench that validates Zeonica shared-memory output against a source-level CPU golden model.
- Update the Zeonica_Testbench submodule pointer to the FFT generated artifacts commit.
- Skip the FFT test in clean CI checkouts when the submodule artifact is unavailable.

## Validation
- `go test -count=1 ./test/testbench/fft`
- `golangci-lint run --timeout=10m ./test/testbench/fft`