A lightweight experimental web service for studying how bounded queues, worker capacity, and request pressure interact to produce overload, rejection, and recovery behaviour.
This project implements a small HTTP service that accepts incoming work requests and processes them using a limited worker pool.
The system is intentionally capacity‑constrained so that overload conditions can be observed and measured.
It is designed as an experimental playground for questions like:
- What happens when requests arrive faster than they can be processed?
- How does bounded queue capacity protect a system?
- When does overload begin to cause request rejection?
- How does recovery occur after pressure subsides?
Rather than focusing on features, this project focuses on systems behaviour under stress.
- Admission control — requests are rejected when the queue is full
- Backpressure — the system protects itself instead of accepting infinite work
- Bounded queues — memory and latency growth are constrained
- Worker pools — fixed processing capacity
- Overload dynamics — behaviour changes under burst pressure
- Recovery behaviour — queues drain once load reduces
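The first three ideas above can be illustrated with Python's standard `queue` module. This is a minimal sketch, not the project's actual `queue_manager.py`: the names `QUEUE_CAPACITY` and `try_admit` are hypothetical and chosen only for illustration.

```python
import queue

# Hypothetical capacity; the real service makes this configurable.
QUEUE_CAPACITY = 3
work_queue = queue.Queue(maxsize=QUEUE_CAPACITY)


def try_admit(job):
    """Admission check: queue the job if there is room, otherwise reject it."""
    try:
        # put_nowait never blocks; it raises queue.Full when the queue is at capacity.
        work_queue.put_nowait(job)
        return True
    except queue.Full:
        return False


# Five jobs arrive with no workers draining the queue.
results = [try_admit(f"job-{i}") for i in range(5)]
print(results)  # [True, True, True, False, False]
```

Rejecting at the door keeps queue depth, and therefore memory use and waiting time, below a fixed limit no matter how fast requests arrive.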
- Service-oriented design
- Background worker threads
- Runtime metrics collection
- Controlled load testing
- Repeatable experiments
- CSV result logging
- Behaviour visualisation
Incoming Requests
↓
Admission Check
↓
Bounded Work Queue
↓
Worker Pool
↓
Job Processing
If the queue is full, new requests are rejected to prevent overload amplification.
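The flow above could be sketched end to end with standard-library threads. This is an assumption-laden illustration rather than the code in `app/workers.py`: the function `run_pipeline` and its parameters are hypothetical, and the job itself is simulated with `time.sleep`.

```python
import queue
import threading
import time


def run_pipeline(num_jobs, queue_capacity=2, worker_count=2, job_seconds=0.01):
    """Send a burst of jobs through a bounded queue and a fixed worker pool."""
    work_queue = queue.Queue(maxsize=queue_capacity)
    counts = {"accepted": 0, "rejected": 0, "processed": 0}
    lock = threading.Lock()

    def worker():
        while True:
            job = work_queue.get()
            if job is None:  # sentinel: tells this worker to stop
                work_queue.task_done()
                return
            time.sleep(job_seconds)  # simulated job processing
            with lock:
                counts["processed"] += 1
            work_queue.task_done()

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(worker_count)]
    for t in threads:
        t.start()

    # Incoming requests: admit when there is room, reject when the queue is full.
    for i in range(num_jobs):
        try:
            work_queue.put_nowait(i)
            with lock:
                counts["accepted"] += 1
        except queue.Full:
            with lock:
                counts["rejected"] += 1

    work_queue.join()  # wait until every accepted job has been processed
    for _ in threads:
        work_queue.put(None)
    for t in threads:
        t.join()
    return counts


counts = run_pipeline(num_jobs=50)
print(counts)
```

The exact split between accepted and rejected jobs depends on thread timing, but every accepted job is processed and nothing accumulates beyond the queue capacity.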
- FastAPI-based HTTP service
- Configurable queue capacity
- Configurable worker count
- Configurable job processing time
- Background worker threads
- Live metrics endpoint
- Metrics reset endpoint
- Automated load test runner
- CSV experiment output
- Matplotlib experiment plots
| Endpoint | Method | Purpose |
|---|---|---|
| `/` | GET | Health check |
| `/submit` | POST | Submit a job request |
| `/metrics` | GET | View runtime metrics |
| `/reset` | POST | Reset metrics counters |
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn app.server:app --reload

Open API docs:
http://127.0.0.1:8000/docs
In a second terminal (with venv active):
python scripts/load_test.py

This runs controlled experiments using different request rates and records:
- accepted requests
- rejected requests
- processed jobs
- acceptance rate
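A container for these four values might look like the sketch below. The class name `Metrics` is hypothetical, and the formula is an assumption: acceptance rate is taken to be accepted requests divided by all submitted requests, which may differ from how `app/metrics.py` defines it.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    """Hypothetical counters mirroring the values recorded per experiment."""
    accepted: int = 0
    rejected: int = 0
    processed: int = 0

    @property
    def acceptance_rate(self) -> float:
        # Assumed definition: accepted / (accepted + rejected).
        total = self.accepted + self.rejected
        return self.accepted / total if total else 0.0

    def reset(self) -> None:
        """Zero all counters, as a reset between experiments would."""
        self.accepted = self.rejected = self.processed = 0


m = Metrics(accepted=30, rejected=10, processed=30)
print(m.acceptance_rate)  # 0.75
```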
Results are saved to:
results/load_experiment_results.csv
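Writing one row per experiment could be done with the standard `csv` module, roughly as follows. The column names in `FIELDS` are assumptions rather than the script's real header, and the rows are placeholder values for illustration, not measured results.

```python
import csv
import io

# Assumed column names; the real CSV header may differ.
FIELDS = ["delay_seconds", "accepted", "rejected", "processed", "acceptance_rate"]


def write_results(rows, fileobj):
    """Write experiment rows (a list of dicts) as CSV with a header line."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Placeholder rows only; these are not real measurements.
rows = [
    {"delay_seconds": 0.1, "accepted": 10, "rejected": 0, "processed": 10, "acceptance_rate": 1.0},
    {"delay_seconds": 0.01, "accepted": 6, "rejected": 4, "processed": 6, "acceptance_rate": 0.6},
]

buffer = io.StringIO()  # stands in for open(path, "w", newline="")
write_results(rows, buffer)
print(buffer.getvalue())
```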
python scripts/plot_results.py

Generated plots:
results/acceptance_rate_vs_delay.png
results/rejected_requests_vs_delay.png
As request delay decreases (higher arrival rate):
- Acceptance rate drops
- Rejected requests increase
- The system reaches a capacity ceiling
This demonstrates classic overload behaviour:
Beyond a certain load, additional pressure increases rejection rather than throughput.
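The capacity ceiling can be estimated with simple arithmetic. The sketch below is a steady-state approximation under stated assumptions: every job takes the same fixed time, arrivals are evenly spaced, and the short-term buffering provided by the queue is ignored. The function names are hypothetical.

```python
def service_capacity(worker_count, job_seconds):
    """Maximum sustained throughput in jobs per second."""
    return worker_count / job_seconds


def expected_acceptance_rate(arrival_rate, worker_count, job_seconds):
    """Approximate long-run fraction of requests that can be accepted."""
    capacity = service_capacity(worker_count, job_seconds)
    if arrival_rate <= capacity:
        return 1.0  # below the ceiling, everything is eventually served
    return capacity / arrival_rate  # above it, the excess must be rejected


# 2 workers with 0.5 s jobs can sustain 4 jobs/s.
print(service_capacity(2, 0.5))              # 4.0
# At 10 requests/s, only about 40% can be accepted.
print(expected_acceptance_rate(10, 2, 0.5))  # 0.4
```

Under these assumptions, doubling the arrival rate past the ceiling leaves throughput unchanged and roughly halves the acceptance rate.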
Because queue size is bounded:
- Work is rejected instead of accumulating indefinitely
- Latency and memory growth are constrained
- The system remains responsive under stress
After burst pressure ends:
- Workers continue draining queued jobs
- Queue depth returns to zero
- System stabilises without manual intervention
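How long recovery takes can be estimated in the same spirit. This sketch assumes no new requests arrive during recovery, every job takes the same fixed time, and all workers are idle when draining starts; `estimated_drain_seconds` is a hypothetical helper, not part of the project.

```python
import math


def estimated_drain_seconds(queue_depth, worker_count, job_seconds):
    """Rough time for a fixed worker pool to empty the queue."""
    # Workers take jobs in rounds of up to worker_count jobs each.
    rounds = math.ceil(queue_depth / worker_count)
    return rounds * job_seconds


# 10 queued jobs, 2 workers, 0.5 s per job: 5 rounds of 0.5 s each.
print(estimated_drain_seconds(10, 2, 0.5))  # 2.5
```

Because queue depth is capped by the configured capacity, this recovery time has a known upper bound.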
The project produces:
- Experiment summary tables
- CSV result datasets
- Acceptance rate plots
- Rejection rate plots
load-backpressure-service/
├── app/
│ ├── server.py
│ ├── queue_manager.py
│ ├── workers.py
│ └── metrics.py
├── scripts/
│ ├── load_test.py
│ └── plot_results.py
├── results/
│ ├── *.csv
│ └── *.png
├── requirements.txt
└── README.md
This project is part of a systems engineering portfolio focused on:
- System stability
- Capacity limits
- Performance under load
- Resilient service design
It complements a companion project that models queue and scheduling behaviour via simulation.
Possible next steps:
- Rate limiting strategies
- Retry and timeout behaviour
- Priority queues
- Dynamic worker scaling
- Circuit breaker patterns
- Tail latency tracking
- Distributed worker processes
MIT

