Ibrahim Salman · [devhms](https://github.com/devhms)
CS @ UET Taxila. I build production systems: obfuscation engines,
OCR pipelines, WhatsApp automation, RAG chatbots, offline STT apps.
Always shipping.
Java, Python, JavaScript · SLSA L3 · Sigstore Cosign · OpenSSF Scorecard
The only open-source LLM anti-scraping tool for Java. Applies 8 adversarial strategies (variable scrambling, dead code injection, control flow flattening, watermark embedding, comment poisoning, Unicode homoglyphs, opaque predicates, string splitting) to source code before public release. The code compiles and runs identically, but degrades LLM training quality if scraped.
- Evades MinHash/LSH deduplication via low-entropy perturbation
- SLSA Level 3 supply chain with Sigstore Cosign signatures, CodeQL, Dependabot
- CLI + GitHub Action + pre-commit hook + JavaFX GUI
- Full strategy pipeline: tokenizer → poisoner → verifier
2 stars · nightshade · mvn package
Python, EasyOCR, Streamlit · 100% test coverage · Forensic audit
High-performance deterministic OCR with 3-Layer A.N.T. Architecture (Architect → Navigator → Tools). Extracts text from PDFs (Poppler), PPTX slides, and images with self-healing retries and exponential backoff.
- 160+ tests, 100% branch coverage, CI-enforced and verified by coverage report
- Parallel threaded processing for multi-page documents
- CLI + Streamlit GUI dashboard with job history, analytics, JSON output
- Forensic audit: 17 critical vulnerabilities remediated (XXE, SQLi, thread isolation)
- Configurable via JSON/YAML, designed for unattended batch runs
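The retry behavior mentioned above (self-healing retries with exponential backoff) could look roughly like the sketch below. This is an illustration under assumptions, not the project's actual code: the function name `retry_with_backoff` and its parameters are hypothetical.

```python
import random
import time


def retry_with_backoff(func, max_attempts=4, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    Hypothetical helper: the delay doubles after each failure
    (base_delay, 2x, 4x, ...), is capped at max_delay, and gets up to
    10% random jitter so parallel workers do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.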
3 stars · OCR
TypeScript, Python, Next.js 15, Convex, Clerk, Vector Search
Authoritative AI assistant for UET Taxila: crawls campus info, syllabi, and official docs, chunks them into Convex vector search, and answers via LLM with citations.
- Custom async BFS crawler (curl_cffi, trafilatura) → chunkMarkdown → embeddings
- Convex vector search + RRF reranking + anti-hallucination guardrails
- Tiered rate limiting (anon/user/admin) via Clerk auth
- 50-pair golden eval dataset with automated evaluation harness
- Freshness cron job, Sentry error tracking, Neo Kinpaku design
- 13 key Python + TypeScript crawlers for comprehensive campus data
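For readers unfamiliar with the RRF reranking step listed above, here is a minimal sketch of reciprocal rank fusion. It assumes two ranked lists of chunk IDs (for example, vector hits and keyword hits); the function name and the sample IDs are hypothetical, and the real pipeline may combine results differently.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is the commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by ID for a deterministic order
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["a", "b", "c"]   # hypothetical chunk IDs
keyword_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Documents that rank well in both lists rise to the top, without needing the two retrievers' scores to be on the same scale.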
4 stars · uet_gpt
Python, Gemini AI, WhatsApp, Circuit Breaker, Smart Cache
Aggregates news from Google RSS, NewsAPI, GNews → deduplicates via cosine similarity clustering → Gemini AI summarizes → delivers formatted WhatsApp reports.
- Thread-safe circuit breaker pattern (CLOSED → OPEN → HALF-OPEN)
- Smart file-based cache reduces API calls by ~60%
- Dashboard (Chart.js) with health reports, analytics, delivery tracking
- Optional TTS video briefings (pyttsx3 + moviepy)
- Exponential backoff retry, rate limiter, content scraper for full-article extraction
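The circuit breaker states listed above can be sketched as follows. This is a simplified illustration, not the bot's implementation: the class name, thresholds, and injectable `clock` are assumptions, and a production version would usually limit HALF-OPEN to a single trial call.

```python
import threading
import time


class CircuitBreaker:
    """Hypothetical breaker: CLOSED -> OPEN after repeated failures,
    OPEN -> HALF-OPEN after a timeout, HALF-OPEN -> CLOSED on success."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._lock = threading.Lock()  # guards state shared across threads
        self._failures = 0
        self._opened_at = None
        self.state = "CLOSED"

    def call(self, func):
        with self._lock:
            if self.state == "OPEN":
                if self._clock() - self._opened_at < self.reset_timeout:
                    raise RuntimeError("circuit open: call skipped")
                self.state = "HALF-OPEN"  # timeout elapsed: allow a trial call
        try:
            result = func()
        except Exception:
            with self._lock:
                self._failures += 1
                if (self.state == "HALF-OPEN"
                        or self._failures >= self.failure_threshold):
                    self.state = "OPEN"
                    self._opened_at = self._clock()
            raise
        with self._lock:
            self._failures = 0
            self.state = "CLOSED"
        return result
```

While the breaker is OPEN, calls fail fast instead of hitting an API that is already struggling.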
DailyNewsBot · DailyAutomationSystem
Python, Selenium, Google Sheets API · Production, 4 bots
Production-grade automation bridging Google Sheets to WhatsApp. Runs daily for jamiat group operations:
- Submission Bot: atomic JSON delivery journal, single-instance lock, dual persistent profiles
- Ghost Hunter: cross-references sheet submissions against member list, flags missing reports
- Red Flag Scanner: anomaly detection on Salah records and discipline entries
- Reminder Bot: scheduled reminders with phone-first delivery + search fallback
Idempotent, self-healing selectors, heartbeat telemetry, failure screenshots. Designed for unattended 24/7 operation.
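One common way to get an atomic, idempotent delivery journal is to write a temporary file and swap it into place with `os.replace`. The sketch below assumes that approach; the function names, the journal layout (a JSON list of message IDs), and the ID format are hypothetical and may not match the bots' actual files.

```python
import json
import os
import tempfile


def load_journal(path):
    """Return the set of delivered message IDs (empty if no journal yet)."""
    try:
        with open(path, encoding="utf-8") as f:
            return set(json.load(f))
    except FileNotFoundError:
        return set()


def record_delivery(path, message_id):
    """Add message_id to the journal atomically.

    Returns False if the ID was already recorded, so a rerun after a
    crash does not deliver the same message twice.
    """
    delivered = load_journal(path)
    if message_id in delivered:
        return False
    delivered.add(message_id)
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file lives in the same directory so os.replace stays atomic
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(sorted(delivered), f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # readers see the old or new file, never half
    except BaseException:
        os.unlink(tmp_path)
        raise
    return True
```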
Python, BeautifulSoup
Lightweight Dawn News scraper: fetches headlines, topic filters, CSV export.
Python, Bezier Splines, Augraphy, OCR Benchmarking
Academic-grade synthetic handwriting pipeline generating high-fidelity handwritten documents from digital text:
| Phase | What | Tech |
|---|---|---|
| 1 | PakE dialect modeling + stochastic typo injection | NLP augmentation, sociolinguistic corpus |
| 2 | Bezier spline motor-path synthesis + ink rendering | 15-feature enhancement (fatigue, rotation, bleed, pressure, snake drift etc.) |
| 3 | Document aging + 3D photo-realism | Augraphy + ISO noise + sub-pixel jitter |
| 4 | Telemetry + OCR benchmark sweep | CER/WER with bootstrap confidence intervals |
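The Phase 4 metrics can be illustrated with a small, dependency-free sketch: CER and WER from Levenshtein distance, plus a percentile bootstrap interval over per-document scores. The function names and defaults are assumptions for illustration; the pipeline's own benchmark code may differ.

```python
import random


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (two-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: character edits / reference length."""
    return edit_distance(ref, hyp) / len(ref)


def wer(ref, hyp):
    """Word error rate: word edits / number of reference words."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def bootstrap_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of values."""
    rng = random.Random(seed)
    means = sorted(
        sum(rng.choices(values, k=len(values))) / len(values)
        for _ in range(n_resamples)
    )
    lo = means[int(n_resamples * alpha / 2)]
    hi = means[int(n_resamples * (1 - alpha / 2)) - 1]
    return lo, hi
```

Reporting an interval rather than a single mean makes it easier to tell whether two OCR engines actually differ on a small test set.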
Referenced research: DiffInk ICLR 2026, InkSpire ICLR 2026, DiffusionPen ECCV 2024.
Rust, TypeScript, Tauri v2 · Whisper · Parakeet · Silero VAD
Cross-platform desktop app for fully offline speech transcription. Press a hotkey → speak → text appears in any application. Zero cloud dependency.
- Whisper (Small/Medium/Turbo/Large) + GPU acceleration via CTranslate2
- Parakeet V3: CPU-optimized with auto language detection
- Silero VAD for voice activity/silence filtering
- Global hotkeys + system tray (Windows/macOS/Linux)
- Managed fork of Handy with 10+ CI workflows, Playwright E2E, WER benchmarking
5 stars · Zuban
| Version | Stack | Highlights |
|---|---|---|
| v1 Portfolio | Next.js 14, R3F, GSAP, React Spring, Zustand, Lenis | 3D matcap scene, magnetic cursor physics, terminal CLI simulator, smooth scroll |
| v2 portfolio_final | Next.js 15, Framer Motion, Vercel AI SDK + Groq | AI assistant chat, next-themes, strict TypeScript, security headers, Turbopack |
TypeScript, React, Remotion, FFmpeg
Programmatic video production for Islami Jamiat-e-Talaba: cinematic promo with 14+ animated scenes, virtual camera, film grain, color grading, Pakistan map visualization, audio mixer, and sub-organization hierarchy.
TypeScript, Next.js, Groq, shadcn/ui
Open-source AI chatbot template with streaming responses + tool integration (weather, components). One-click Vercel deploy. Forked from Vercel AI SDK.
| Domain | Tools |
|---|---|
| Languages | Java (Maven), Python, TypeScript, Rust, JavaScript, C++ |
| Security | SLSA, Sigstore Cosign, CodeQL, OpenSSF Scorecard, OWASP |
| AI/ML | Gemini, EasyOCR, Whisper, Parakeet, PyTorch, DiffInk |
| Infrastructure | Docker, GitHub Actions, Vercel, Convex, Clerk, Supabase |
| Desktop | Tauri v2, JavaFX, Streamlit, Selenium |
| Creative | Three.js, R3F, GSAP, Framer Motion, Remotion |
- Building Nightshade v2: expanding strategy count + WASM frontend for browser-side obfuscation
- Publishing the PakE OCR Corpus paper: synthetic handwriting for Urdu/English low-resource OCR
- Deep-diving Rust systems programming + vector search internals
- Open to collaboration on LLM security research and open-source OCR tooling
- Ask me about: code obfuscation, RAG pipelines, WhatsApp automation, offline STT
"A computer can do it; I just have to tell it how."