gpu-kernel

Here are 3 public repositories matching this topic...

meta-pytorch / tritonparse

TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels

debugging compiler pytorch triton structured-logging interactive-visualization ir-analysis gpu-kernel ir-visualization

Updated Jun 26, 2026
Python

An end-to-end agent project for GPU kernel implementation, analysis, profiling, and iterative optimization. It helps an agent turn PyTorch logic or an existing kernel into a high-performance GPU kernel through a structured, profile-driven workflow.

agent skills gpu nvidia-gpu gpu-programming amd-gpu agentic-workflow self-improving-agent gpu-kernel autoresearch kernel-generation

Updated Jun 27, 2026
Python

soncel15 / rocm-kernel-lab

Production-grade HIP kernel optimization lab — matrix ops, reductions, shared memory patterns

hpc amd optimization hip rocm gpu-kernel

Updated Jun 12, 2026
HIP
