AI Full-stack Engineer & Embodied AI Developer
LLM Deployment · AI Agents · Computer Vision · Edge Computing
📍 Shenzhen, China · 🧠 Focused on AI-driven hardware & software integration
Dedicated to AI Model Commercialization & Embodied AI Development
I don't just call APIs; I build platform-level loops from edge hardware to cloud LLMs.
🔧 What I Do
- LLM Infrastructure: Deployment and optimization using vLLM and FastGPT.
- Autonomous Agents: Designing multi-skill agents based on the MoltBot framework.
- Edge CV: Real-time vision tasks on NVIDIA Jetson and Rockchip platforms.
- System Integration: Bridging the gap between digital intelligence and physical control.
🧠 Core Skills
- LLM Deployment: vLLM, Qwen, DeepSeek, model quantization.
- RAG & Knowledge: FastGPT, vector database integration.
- Agent Frameworks: Tinbot development, Browser Automation, Task Planning.
- Models: YOLOv10, real-time object detection.
- Acceleration: TensorRT C++ deployment, model distillation.
- Edge Platforms: NVIDIA Jetson Orin NX, RK3588S.
- Embedded: ESP32-S3, UART/I2C/SPI interface protocols.
- DevOps: Docker, Linux system optimization, CI/CD pipelines.
🌟 Key Projects
End-to-End Robotics Software Architecture · Core Developer
A fully integrated physical robot seamlessly combining voice interaction, vision, and mechanical control. (Closed-source company product)
- Intelligent Workflow: Engineered a closed-loop system: Voice Wake-up -> ASR -> Agent Dispatch -> Robot Execution -> TTS Broadcast.
- Vision & Grasping: Combined multi-modal LLMs, FastSAM + TensorRT for real-time segmentation, and depth-camera-guided inverse kinematics for precise object grasping.
- Cognitive & Navigation: Integrated FastGPT for online knowledge-base dialogue and enabled the robot to carry out autonomous navigation tasks.
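The closed-loop workflow above can be pictured as a chain of stages that pass a shared state along. The snippet below is only a minimal sketch of that idea, not the product's code (which is closed-source): every function name, the wake phrase, and the keyword-based dispatch rule are hypothetical stand-ins, and the "audio" is plain text so the example stays self-contained.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Stage = Callable[[State], State]

WAKE_WORD = "hey robot"  # hypothetical wake phrase, for illustration only


def wake_up(state: State) -> State:
    # Stand-in for wake-word detection: check the start of the utterance.
    heard = state["audio"].lower()
    return {**state, "awake": heard.startswith(WAKE_WORD)}


def asr(state: State) -> State:
    # Stand-in for speech recognition: drop the wake word, keep the command.
    text = state["audio"].lower()[len(WAKE_WORD):].strip(" ,.")
    return {**state, "text": text}


def dispatch(state: State) -> State:
    # Stand-in for agent dispatch: a keyword rule instead of an LLM planner.
    skill = "grasp" if "pick up" in state["text"] else "chat"
    return {**state, "skill": skill}


def execute(state: State) -> State:
    # Stand-in for robot execution: report what the chosen skill would do.
    result = "object grasped" if state["skill"] == "grasp" else "no motion needed"
    return {**state, "result": result}


def tts(state: State) -> State:
    # Stand-in for the TTS broadcast: build the sentence to be spoken.
    return {**state, "speech": f"Done: {state['result']}"}


PIPELINE: List[Stage] = [asr, dispatch, execute, tts]


def run_turn(audio: str) -> State:
    """Run one interaction turn; stop early if the wake word is absent."""
    state = wake_up({"audio": audio})
    if not state["awake"]:
        return state
    for stage in PIPELINE:
        state = stage(state)
    return state


if __name__ == "__main__":
    print(run_turn("Hey robot, pick up the cup")["speech"])
```

Keeping each stage as a function from state to state makes it straightforward to swap a stub for a real component (an ASR engine, an agent planner, a motion controller) without touching the rest of the loop.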
Autonomous Multi-Skill AI Agent · Lead Developer
Bridging browser automation and local OS control.
- Built on the MoltBot architecture to handle high-complexity, multi-step tasks.
- Integrated native browser automation for dynamic data retrieval.
High-Performance Local RAG Deployment Solution · Creator
A one-stop deployment architecture solving VRAM OOM and network constraints.
- Inference & Search: Built with vLLM (Qwen2.5-7B) and TEI (BGE-M3) to serve high-concurrency, low-latency queries.
- VRAM Optimization: Tuned model parameters so the LLM and the embedding model run stably side by side on a single GPU.
- Network & Access: Customized for domestic network environments (HF/Docker acceleration) and integrated FRP + Nginx for secure public access to intranet services.
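To make the single-GPU co-hosting idea concrete, here is a rough sketch of the budgeting arithmetic behind it. It assumes the LLM is served by vLLM, whose `--gpu-memory-utilization` option caps the fraction of GPU memory the engine may claim. The helper name and all memory figures are illustrative assumptions, not the values used in this project.

```python
import math


def vllm_memory_fraction(total_gib: float, embedding_gib: float, headroom_gib: float) -> float:
    """Fraction of the GPU that can be handed to vLLM after reserving memory
    for the embedding server and some safety headroom (all values assumed)."""
    available = total_gib - embedding_gib - headroom_gib
    if available <= 0:
        raise ValueError("embedding model and headroom already exceed GPU memory")
    # Round down to two decimals so the budget is never overshot.
    return math.floor(available / total_gib * 100) / 100


if __name__ == "__main__":
    # Illustrative numbers only: a 24 GiB card, 3 GiB for the embedding
    # model, 2 GiB left free for CUDA context and fragmentation.
    fraction = vllm_memory_fraction(total_gib=24, embedding_gib=3, headroom_gib=2)
    print(f"--gpu-memory-utilization {fraction}")
```

With these example numbers the helper yields 0.79, which would then be passed to the vLLM server at launch. In practice the right split depends on context length and batch size as well as the card.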
🏢 Background & Focus
- AI R&D Engineer at a Shenzhen-based startup.
- Focused on full-stack AI implementation from data annotation to deployment.
- Embodied AI: Bringing Large Multi-modal Models (LMM) into physical hardware.
- System Efficiency: Maximizing AI performance on resource-constrained edge devices.
- Developer Experience: Building reusable AI components and automated workflows.
- Email: t1anhu4w@gmail.com
- GitHub: https://github.com/T1anhu4