Building production AI systems at the intersection of agent evaluation, LLM infra, and applied ML.
Based in San Jose, CA · Open to collaboration on AI tooling, agent frameworks, and dev infrastructure.
I design and ship AI agent systems, from multi-tier memory architectures to adversarial evaluation pipelines. My work focuses on making LLM agents more reliable, observable, and production-ready.
- Agent Infrastructure — SageMem: GPU-inspired L1/L2/L3/DRAM memory hierarchy for multi-agent systems using Redis + pgvector
- Agent Observability — playback: Session log visualizer that turns raw JSONL agent traces into step-by-step timelines
- Adversarial Evaluation — gauntlet: CLI tool spawning 6 AI personas to red-team PRDs and codebases for weaknesses
- Developer Tooling — claude-deck: Claude Code statusline showing context %, model, API usage, and Spotify track
- Applied AI — NFL-contract-coach: AI platform reviewing sports contracts for predatory clauses across 10 risk categories
- Embodied AI — reachy-rx: Pharmacist robot on Reachy-mini (3rd place, Seeed Studio hackathon)
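
To give a flavor of the trace-to-timeline idea behind playback, here is a minimal sketch. It is not playback's actual code: the function name `build_timeline` and the event fields `ts`, `type`, and `summary` are assumptions made for illustration, and a real agent trace will likely use a different schema.

```python
import json
from datetime import datetime


def build_timeline(jsonl_text):
    """Turn raw JSONL trace text into an ordered list of timeline steps.

    Assumed (hypothetical) schema: each line is a JSON object with an
    ISO-8601 "ts" timestamp, a "type" string, and an optional "summary".
    """
    events = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        events.append(json.loads(line))

    # Order events chronologically, since logs may be written out of order.
    events.sort(key=lambda e: datetime.fromisoformat(e["ts"]))
    if not events:
        return []

    start = datetime.fromisoformat(events[0]["ts"])
    timeline = []
    for step, event in enumerate(events, start=1):
        elapsed = datetime.fromisoformat(event["ts"]) - start
        timeline.append({
            "step": step,
            "offset_s": elapsed.total_seconds(),  # seconds since first event
            "type": event["type"],
            "summary": event.get("summary", ""),
        })
    return timeline
```

Each returned entry carries a step number and the seconds elapsed since the first event, which is enough to render a simple step-by-step view, for example with `for s in build_timeline(text): print(s["step"], s["offset_s"], s["type"], s["summary"])`.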
Languages: Python · TypeScript · JavaScript · Shell · C++
AI/ML: LangChain · LlamaIndex · DSPy · PyTorch · Ray · FastAPI · MCP · OpenAI · Anthropic · Gemini
Infra: Redis · PostgreSQL + pgvector · AWS (ECS, S3, CloudFront, DynamoDB) · Docker
Frontend: React · Next.js · Tailwind CSS · Convex · Liveblocks
→ GenAI_notebooks — curated notebooks on LLM fine-tuning, DSPy, NVIDIA NeMo, and agent evaluation


