hamidmatiny/README.md

Mohammadreza Matiny

AI Software & MLOps Engineer · Production Deep Learning · Low-Latency Async APIs



About Me

Hybrid engineer bridging deep learning mathematics and production infrastructure.

I build, optimize, and deploy production-grade deep learning systems and low-latency distributed data pipelines — with 6+ years across core software engineering, computer vision, data science, and ML infrastructure. Currently, I bring this experience to Torc Robotics as a Quality Assurance and Annotation Specialist, verifying high-fidelity data streams for autonomous vehicle environments. I specialize in distributed compute scheduling, strict data contract enforcement, and localized hardware performance optimization, turning complex multi-modal data streams into robust, stable, and highly scalable production networks.


Core Expertise

MLOps & Deployment

Docker Terraform AWS GCP FastAPI MLflow GitHub Actions

Deep Learning & Core ML

Python PyTorch Hugging Face Sequence Models Sensor Fusion

Data & Distributed Systems

Ray Pandera SQL Pandas Pytest PyArrow Parquet Data Pipelines

Optimization & Acceleration

GPU Acceleration Mixed Precision torch.compile


Featured Architectural Builds

A high-performance distributed data gatekeeper for streaming machine learning and data pipelines. It intercepts production data feeds to catch statistical anomalies, structural degradation, and data drift. The core engine parallelizes data processing workloads across asynchronous Ray tasks and stateful actors, verifying every batch against Pandera schema contracts.

Ray Core · Pandera · Distributed Validation · Data Quality Gates · Docker
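
As a rough illustration of the gatekeeper pattern described above, the sketch below validates batches in parallel and splits them into accepted and rejected sets. It is not the project's code: it deliberately swaps in the standard library (`concurrent.futures` in place of Ray tasks, a hand-written dictionary in place of a Pandera schema) so that it runs without dependencies. The names `SCHEMA`, `BatchReport`, `validate_batch`, and `run_gate`, and the columns `speed_kph` and `frame_id`, are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

# Hypothetical data contract: column -> (expected type, min, max).
# A real deployment would express this as a Pandera schema instead.
SCHEMA = {
    "speed_kph": (float, 0.0, 300.0),
    "frame_id": (int, 0, 10**9),
}


@dataclass
class BatchReport:
    batch_id: int
    errors: list = field(default_factory=list)

    @property
    def ok(self):
        return not self.errors


def validate_batch(batch_id, rows):
    """Check every row of one batch against SCHEMA and collect violations."""
    report = BatchReport(batch_id)
    for i, row in enumerate(rows):
        for column, (expected_type, low, high) in SCHEMA.items():
            if column not in row:
                report.errors.append(f"row {i}: missing column {column}")
                continue
            value = row[column]
            if not isinstance(value, expected_type):
                report.errors.append(f"row {i}: {column} has wrong type")
            elif not low <= value <= high:
                report.errors.append(
                    f"row {i}: {column}={value} outside [{low}, {high}]"
                )
    return report


def run_gate(batches, max_workers=4):
    """Validate batches concurrently; return (accepted ids, rejected reports)."""
    # Thread pool used as a stand-in for distributed Ray tasks.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        reports = list(
            pool.map(lambda item: validate_batch(*item), enumerate(batches))
        )
    accepted = [r.batch_id for r in reports if r.ok]
    rejected = [r for r in reports if not r.ok]
    return accepted, rejected
```

Collecting all violations per batch, rather than stopping at the first one, keeps the rejection report useful for debugging an upstream feed.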


Enterprise-grade data infrastructure automation focused on schema safety and decoupled deployments. Behavioral regression contracts are enforced through automated Pytest suites, and the cloud infrastructure is provisioned on AWS with Terraform (Infrastructure as Code).

Terraform · AWS Infrastructure · Pytest · Schema Contracts · Continuous Delivery


AI-driven asynchronous travel engine built on search-then-synthesize architecture. Integrates xAI's Grok API with async Python to manage concurrent data streams, parallel I/O-bound LLM calls, and low-latency task scheduling — delivering hyper-personalized itineraries without blocking the request path.

FastAPI · AsyncIO · xAI Grok · Streamlit · Search-then-Synthesize
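
The search-then-synthesize flow described above can be sketched with plain `asyncio`: one search step produces topics, then the I/O-bound synthesis calls fan out concurrently under a concurrency cap. This is a minimal sketch under stated assumptions, not the project's implementation. `search`, `fake_llm_call`, and `build_itinerary` are hypothetical names, and the LLM call is simulated with `asyncio.sleep` instead of a real xAI Grok request.

```python
import asyncio


async def search(query):
    """Hypothetical search step: returns topics to synthesize."""
    await asyncio.sleep(0.01)  # simulated network latency
    return [f"{query} hotels", f"{query} food", f"{query} museums"]


async def fake_llm_call(prompt, delay=0.05):
    """Hypothetical stand-in for a network-bound LLM request."""
    await asyncio.sleep(delay)
    return f"summary of: {prompt}"


async def build_itinerary(query, max_concurrency=2):
    """Search first, then synthesize each topic concurrently."""
    topics = await search(query)
    limiter = asyncio.Semaphore(max_concurrency)  # cap in-flight LLM calls

    async def bounded(topic):
        async with limiter:
            return await fake_llm_call(topic)

    # gather() returns results in the order the awaitables were passed in,
    # regardless of which call finishes first.
    sections = await asyncio.gather(*(bounded(t) for t in topics))
    return "\n".join(sections)


if __name__ == "__main__":
    print(asyncio.run(build_itinerary("Lisbon")))
```

The semaphore is the design choice worth noting: it keeps the event loop free while bounding how many upstream requests are in flight, which matters when the provider enforces rate limits.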


Asynchronous ML serving infrastructure on GCP using FastAPI and containerized Docker environments. Evolved from custom ResNet-18 to Hugging Face ViT transfer learning with production-oriented inference paths — achieving a 40% reduction in production latency through async request handling and optimized model serving.

Vision Transformer · FastAPI · Docker · GCP · Hugging Face


Localized hardware optimization routines leveraging mixed-precision (AMP), torch.compile, and advanced DataLoader tuning across CUDA and Apple MPS backends. Systematic profiling and kernel-level tuning deliver 2–4× inference speedups with ~60% memory reduction on constrained hardware.

PyTorch AMP · torch.compile · CUDA · MPS · DataLoader Tuning


Engineering Philosophy

Production-quality code is resource-efficient by design, not optimized as an afterthought. I treat messy multi-modal sensor data and massive data-streaming pipelines as first-class systems problems: ingest, distribute, validate, and serve with the same rigor applied to model architecture. Success is measured by deployment metrics (p99 latency, compute node efficiency, structural schema integrity, and overall pipeline resiliency), not theoretical benchmarks that never survive contact with production data streams.

Pinned

  1. lstm-attention-transformers Public

    Implementations of neural sequence models including RNNs, LSTMs, and Transformers. Includes attention weight visualization, Seq2Seq frameworks, and fine-tuning workflows for NLP tasks.

    Jupyter Notebook

  2. cuda-optimization Public

    Complete guide to optimizing deep learning models with PyTorch. Adapted for M3 MacBook with Metal Performance Shaders (MPS).

    Jupyter Notebook

  3. itinera Public

    Hyper-personalized travel itineraries powered by xAI Grok, with a FastAPI backend and Streamlit frontend.

    Python

  4. hydra-data-factory Public

    Production-grade, serverless AWS data pipeline simulating large-scale autonomous-vehicle telemetry-processing at fleet scale. This repository demonstrates end-to-end ingestion, Distributed Stream T…

    Python

  5. sentinel-ray Public

    Distributed camera telemetry ingestion with Ray Core, Pandera QA validation, statistical drift detection, and automated incident orchestration.

    Python

  6. vanguard-telemetry-monitor Public

    A production-pattern vehicle telemetry simulation and observability platform built in four incremental phases. Vanguard generates realistic fleet telemetry, injects production-like anomalies, expos…

    Python