Real-time Traffic Sign Detection for Autonomous Driving
A progressive research and implementation journey: from a CNN baseline, through Faster R-CNN with anchor tuning, to YOLOX trained for 100 epochs on a custom-curated Indian traffic sign dataset.
This project investigates deep learning approaches for robust traffic sign detection, a safety-critical component of autonomous driving systems. The work spans multiple phases of experimentation, each building on the limitations discovered in the previous one, and culminates in a YOLOX-based detector that achieved the strongest detection performance of all models tested on the custom dataset.
Deep-Sign-Vision/
├── phase1_mini_project/ # Baseline: CNN + initial Faster R-CNN exploration
├── phase2_data_preprocessing/ # Dataset curation, filtering, bbox conversion, resizing
├── phase3_faster_rcnn_experiments/ # Systematic Faster R-CNN experiments (128×128)
├── phase4_faster_rcnn_256x256/ # Faster R-CNN scaled to 256×256 resolution
├── phase5_yolox/ # Final model: YOLOX trained for 100 epochs
├── results/ # Confusion matrices and evaluation outputs
└── report/ # Full project report (Pre-Final)
- Built a custom Convolutional Neural Network for traffic sign classification
- Established a performance baseline for downstream comparison
- Converted and prepared the initial dataset for model training
- Bounding box conversion from Roboflow COCO format to PyTorch-compatible format
- Image resizing to 128×128 for efficient training
- Class distribution analysis to identify class imbalance across 20+ sign categories
- Class filtering: retained only classes with ≥100 annotated instances to reduce noise and improve convergence
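The filtering and conversion steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the annotation dictionaries with `label` and `bbox` keys are a hypothetical schema, and the sketch assumes the "PyTorch-compatible format" is the `[x1, y1, x2, y2]` corner format that torchvision detection models expect. COCO boxes are stored as `[x_min, y_min, width, height]`, so the conversion also rescales coordinates to match the resized image.

```python
from collections import Counter

MIN_INSTANCES = 100  # threshold stated in the list above
TARGET_SIZE = 128    # side length of the resized square image


def keep_frequent_classes(annotations, min_instances=MIN_INSTANCES):
    """Drop annotations whose class has fewer than min_instances boxes.

    Each annotation is a dict with a "label" key (hypothetical schema).
    """
    counts = Counter(a["label"] for a in annotations)
    return [a for a in annotations if counts[a["label"]] >= min_instances]


def coco_to_corners(bbox, img_w, img_h, target=TARGET_SIZE):
    """Convert a COCO [x, y, w, h] box to [x1, y1, x2, y2] in the resized image."""
    x, y, w, h = bbox
    sx, sy = target / img_w, target / img_h  # per-axis resize factors
    return [x * sx, y * sy, (x + w) * sx, (y + h) * sy]
```

For example, a box `[64, 32, 128, 64]` in a 256×256 image maps to `[32.0, 16.0, 96.0, 48.0]` after resizing to 128×128.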
A systematic ablation study across five configurations:
| Experiment | Description |
|---|---|
| Single Class (Crosswalk) | Isolated the highest-frequency class to validate the detection pipeline |
| Two Class | Extended to top-2 classes, tested multi-class generalisation |
| Filtered Dataset — 50 Epochs | Full filtered class set, extended training |
| Anchor Tuning | Custom anchor scales/ratios matched to traffic sign aspect ratios |
| Final — 30 Epochs | Consolidated best configuration from above experiments |
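One possible way to choose anchors for the Anchor Tuning experiment is to derive them from the ground-truth box statistics. The sketch below is an assumption about how that could be done, not the procedure used in this project: `suggest_anchor_params` is a hypothetical helper that takes the quartiles of box scale (square root of area) and aspect ratio (height over width, the convention torchvision's `AnchorGenerator` uses for its `aspect_ratios` argument).

```python
import math
import statistics


def suggest_anchor_params(box_sizes):
    """Suggest anchor scales and aspect ratios from (width, height) pairs.

    Returns two lists of three values each: the quartiles of sqrt(area)
    and the quartiles of height / width across all boxes.
    """
    scales = [math.sqrt(w * h) for w, h in box_sizes]
    ratios = [h / w for w, h in box_sizes]
    return statistics.quantiles(scales, n=4), statistics.quantiles(ratios, n=4)
```

Values obtained this way could then be rounded and passed to the detector's anchor generator in place of the defaults. Since most traffic signs are roughly square, the suggested ratios would be expected to cluster near 1.0.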
- Scaled input resolution to 256×256 to capture finer spatial detail
- Retrained with the optimal configuration from Phase 3
- Evaluated mAP improvement vs. the 128×128 baseline
- Implemented YOLOX-S, a single-stage anchor-free detector
- Converted the filtered dataset to COCO format for YOLOX compatibility
- Trained for 100 epochs on Google Colab with GPU acceleration
- Achieved the strongest detection performance across all experiments
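The COCO conversion step could look something like the sketch below. It assumes a hypothetical in-memory record layout (file name, image size, and a list of labelled boxes already in `[x, y, w, h]` form) and builds the `images`, `annotations`, and `categories` sections that a COCO-style JSON file contains. The function name and record layout are illustrative only.

```python
import json


def build_coco(records, class_names):
    """Build a COCO-style dict.

    records: iterable of (file_name, width, height, boxes), where boxes is a
    list of (label, x, y, w, h) tuples (hypothetical layout).
    """
    cat_ids = {name: i + 1 for i, name in enumerate(class_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": cid, "name": name} for name, cid in cat_ids.items()],
    }
    ann_id = 1
    for img_id, (file_name, width, height, boxes) in enumerate(records, start=1):
        coco["images"].append(
            {"id": img_id, "file_name": file_name, "width": width, "height": height}
        )
        for label, x, y, w, h in boxes:
            coco["annotations"].append({
                "id": ann_id,
                "image_id": img_id,
                "category_id": cat_ids[label],
                "bbox": [x, y, w, h],  # COCO keeps top-left corner plus size
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco
```

The resulting dictionary can be written out with `json.dump(coco, f)` to produce the annotation file.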
| Model | Resolution | Epochs | Notes |
|---|---|---|---|
| CNN (Baseline) | 128×128 | — | Classification only |
| Faster R-CNN | 128×128 | 30 | Multi-class detection baseline |
| Faster R-CNN (Anchor Tuned) | 128×128 | 50 | Custom anchors for sign aspect ratios |
| Faster R-CNN | 256×256 | 30 | Higher resolution input |
| YOLOX-S | 640×640 | 100 | Best performing model |
Confusion matrices for key experiments are available in the results/ directory.
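For readers unfamiliar with confusion matrices in a detection setting, the sketch below shows one common way to compute the counts. It is an illustrative assumption, not necessarily how the matrices in results/ were produced: each ground-truth box is greedily matched to the unmatched prediction with the highest IoU (at least 0.5 here), and a hypothetical `background` label absorbs missed and spurious boxes.

```python
from collections import defaultdict

BACKGROUND = "background"  # hypothetical label for missed or spurious boxes


def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def confusion_counts(gt, preds, iou_thr=0.5):
    """Count (true_label, predicted_label) pairs for one image.

    gt and preds are lists of (label, box) tuples.
    """
    counts = defaultdict(int)
    used = set()
    for g_label, g_box in gt:
        best_j, best_iou = None, iou_thr
        for j, (_, p_box) in enumerate(preds):
            if j in used:
                continue
            value = iou(g_box, p_box)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j is None:
            counts[(g_label, BACKGROUND)] += 1  # missed detection
        else:
            used.add(best_j)
            counts[(g_label, preds[best_j][0])] += 1
    for j, (p_label, _) in enumerate(preds):
        if j not in used:
            counts[(BACKGROUND, p_label)] += 1  # false positive
    return dict(counts)
```

Summing these per-image counts over the evaluation set gives the cells of the matrix, with off-diagonal entries showing which sign classes are confused with each other.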
- Source: Roboflow — Indian Traffic Sign Dataset
- Preprocessing: Resized to 128×128 and 256×256, filtered to classes with ≥100 instances
- Format: COCO JSON for YOLOX; CSV + custom PyTorch Dataset for Faster R-CNN
- Classes: 20+ Indian traffic sign categories including crosswalk, stop, speed limit, and more
The raw image dataset is not included in this repository due to size constraints. Notebooks reference dataset paths from Google Drive / Colab uploads.
- Frameworks: PyTorch, Torchvision, YOLOX
- Environment: Google Colab (GPU)
- Data Management: Roboflow, pandas, PIL
- Evaluation: COCO API (pycocotools), confusion matrices
The complete project report, including literature review, methodology, results, and analysis, is available in report/Major_Project_2_Report.pdf.