LLM Gateway

A unified API gateway for multiple Large Language Model (LLM) providers. It exposes a single endpoint for models served by Gemini, Groq, Mistral, and Nvidia.

Features

Unified API: Single /v1/chat/completions endpoint for all supported providers
Intelligent Routing: Automatically routes requests to the appropriate provider based on model name
Response Caching: Redis-based caching to reduce latency and API costs
Authentication: Secure API key-based authentication
Request Logging: PostgreSQL database for tracking usage and costs
Rate Limiting: Built-in middleware for request throttling (configurable)
Docker Support: Easy deployment with Docker Compose

Supported Providers & Models

| Provider | Models | Platform |
| --- | --- | --- |
| Gemini | `gemini-1.5-flash`, `gemini-1.5-pro` | Google AI |
| Groq | `llama-3.1-70b`, `mixtral-8x7b`, `gemma-7b` | Groq Cloud |
| Mistral | `mistral-large`, `mistral-medium` | Mistral AI |
| Nvidia | `deepseek-v3`, various Nvidia models | Nvidia AI |

Quick Start

Prerequisites

Docker and Docker Compose
API keys for desired providers

1. Clone and Setup

git clone <your-repo-url>
cd llm-gateway

2. Environment Configuration

Create a .env file in the root directory:

# Provider API Keys
GEMINI_API_KEY=your_gemini_api_key
GROQ_API_KEY=your_groq_api_key
MISTRAL_API_KEY=your_mistral_api_key
NVIDIA_API_KEY=your_nvidia_api_key

# Gateway Configuration
GATEWAY_API_KEY=your_gateway_secret_key

3. Launch with Docker

docker-compose up --build

Once the containers are running, the gateway is available at http://localhost:8000.

Local Development

Install Dependencies

pip install -r requirements.txt

Run Locally

# Start Redis (if not using Docker)
redis-server

# Start PostgreSQL (if not using Docker)
# Configure your local database

# Run the application
python main.py

API Usage

Authentication

All requests require the x-gateway-key header:

curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "x-gateway-key: your_gateway_secret_key" \
  -d '{
    "model": "gemini-1.5-flash",
    "messages": [
      {"role": "user", "content": "Hello, world!"}
    ]
  }'
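
On the server side, the header check could look roughly like the sketch below. It is an illustration only: `is_authorized` is a hypothetical helper, not a function taken from middleware.py, and it assumes a single shared key (the value of `GATEWAY_API_KEY`) rather than a per-user lookup in the users table.

```python
import hmac


def is_authorized(headers: dict[str, str], expected_key: str) -> bool:
    """Return True when the x-gateway-key header matches the configured key.

    Hypothetical helper: assumes one shared gateway key.
    """
    # Refuse everything if no key is configured, so an empty setting
    # can never match an empty or missing header.
    if not expected_key:
        return False
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    supplied = normalized.get("x-gateway-key", "")
    # compare_digest runs in constant time, which avoids leaking the key
    # through response-time differences.
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))
```

A request that fails this check would typically be rejected with a 401 status before any provider is contacted.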

Request Format

The gateway accepts standard OpenAI-compatible chat completion requests:

{
  "model": "gemini-1.5-flash",
  "messages": [
    {"role": "user", "content": "Your message here"}
  ],
  "temperature": 0.7,
  "max_tokens": 100
}
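
For Python clients, the same request can be assembled with the standard library alone. This is a minimal sketch: `build_chat_request` is a hypothetical helper name, and the base URL and key are the placeholder values used elsewhere in this README.

```python
import json
import urllib.request


def build_chat_request(base_url: str, gateway_key: str, model: str,
                       messages: list[dict], **params) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/chat/completions."""
    # Optional fields such as temperature or max_tokens are passed through as-is.
    payload = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-gateway-key": gateway_key,
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the JSON body of the response.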

Model Routing

The gateway automatically routes based on model names:

gemini-* → Gemini API
llama-*, mixtral-*, gemma-* → Groq API
mistral-* → Mistral API
nvidia-*, deepseek-* → Nvidia API
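
The prefix rules above can be expressed as a small lookup. This is a sketch of the idea, not the code in main.py: `ROUTES` and `resolve_provider` are hypothetical names, and the matching is assumed to be case-sensitive.

```python
# Each entry maps a tuple of model-name prefixes to a provider label.
ROUTES = (
    (("gemini-",), "gemini"),
    (("llama-", "mixtral-", "gemma-"), "groq"),
    (("mistral-",), "mistral"),
    (("nvidia-", "deepseek-"), "nvidia"),
)


def resolve_provider(model: str) -> str:
    """Return the provider label for a model name, based on its prefix."""
    for prefixes, provider in ROUTES:
        # str.startswith accepts a tuple and matches any of its members.
        if model.startswith(prefixes):
            return provider
    raise ValueError(f"No provider configured for model {model!r}")
```

Note that `mixtral-*` and `mistral-*` are distinct prefixes, so `mixtral-8x7b` goes to Groq while `mistral-large` goes to Mistral.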

Architecture

┌─────────────────┐    ┌─────────────────┐
│   Client Apps   │────│   LLM Gateway   │
└─────────────────┘    └─────────────────┘
                                │
                ┌───────────────┼───────────────┐
                │               │               │
        ┌───────▼──────┐ ┌──────▼──────┐ ┌──────▼─────┐
        │   Redis      │ │ PostgreSQL  │ │ Providers  │
        │   Cache      │ │   Logs      │ │  APIs      │
        └──────────────┘ └─────────────┘ └────────────┘

Database Schema

Users Table

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    api_key VARCHAR(255) UNIQUE NOT NULL,
    name VARCHAR(100),
    balance_usd DECIMAL(10, 4) DEFAULT 0.0
);

Request Logs Table

CREATE TABLE request_logs (
    id SERIAL PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    provider VARCHAR(50),
    model_name VARCHAR(100),
    prompt_tokens INTEGER,
    completion_tokens INTEGER,
    total_cost DECIMAL(10, 6),
    status_code INTEGER,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
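
How `total_cost` is calculated is not specified here. One plausible approach, assuming each model has a price per 1,000 prompt tokens and per 1,000 completion tokens, is sketched below. `request_cost` and its price parameters are hypothetical, and the prices in any call are made up for illustration. The result is rounded to six decimal places to match the `DECIMAL(10, 6)` column.

```python
from decimal import Decimal, ROUND_HALF_UP

# Six decimal places, matching total_cost DECIMAL(10, 6).
COST_PRECISION = Decimal("0.000001")


def request_cost(prompt_tokens: int, completion_tokens: int,
                 prompt_price_per_1k: Decimal,
                 completion_price_per_1k: Decimal) -> Decimal:
    """Compute a request's cost in USD from token counts (assumed pricing model)."""
    raw = (
        Decimal(prompt_tokens) * prompt_price_per_1k
        + Decimal(completion_tokens) * completion_price_per_1k
    ) / Decimal(1000)
    # Decimal avoids binary floating-point drift when many small costs are summed.
    return raw.quantize(COST_PRECISION, rounding=ROUND_HALF_UP)
```

The returned value can be stored in `request_logs.total_cost` together with the two token counts it was derived from.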

Configuration

Environment Variables

| Variable | Description | Required |
| --- | --- | --- |
| `GEMINI_API_KEY` | Google Gemini API key | No |
| `GROQ_API_KEY` | Groq API key | No |
| `MISTRAL_API_KEY` | Mistral AI API key | No |
| `NVIDIA_API_KEY` | Nvidia API key | No |
| `GATEWAY_API_KEY` | Gateway authentication key | Yes |
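
A startup check along these lines would enforce the required/optional split in the table. It is a sketch under assumptions: `load_settings` and `PROVIDER_KEY_VARS` are hypothetical names, and providers without a key are simply treated as disabled.

```python
import os
from typing import Mapping

# Provider label -> environment variable holding its API key.
PROVIDER_KEY_VARS = {
    "gemini": "GEMINI_API_KEY",
    "groq": "GROQ_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "nvidia": "NVIDIA_API_KEY",
}


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read gateway settings, failing fast when the required key is missing."""
    gateway_key = env.get("GATEWAY_API_KEY")
    if not gateway_key:
        raise RuntimeError("GATEWAY_API_KEY is required")
    # Provider keys are optional: keep only those that are set and non-empty.
    provider_keys = {
        provider: env[var]
        for provider, var in PROVIDER_KEY_VARS.items()
        if env.get(var)
    }
    return {"gateway_key": gateway_key, "provider_keys": provider_keys}
```

Passing a plain dict instead of `os.environ` makes the function easy to exercise in tests.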

Cache Configuration

Cache TTL: 1 hour (3600 seconds)
Cache key: SHA256 hash of model + messages
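
A cache key of this kind could be derived as in the sketch below. The exact serialization used by the gateway is not documented here, so the canonical JSON form is an assumption, and `cache_key` is a hypothetical helper name.

```python
import hashlib
import json

CACHE_TTL_SECONDS = 3600  # 1 hour, as stated above


def cache_key(model: str, messages: list[dict]) -> str:
    """Return a SHA-256 hex digest identifying a (model, messages) pair."""
    # Sorting keys and fixing separators makes the serialization deterministic,
    # so logically identical requests always hash to the same key.
    canonical = json.dumps(
        {"model": model, "messages": messages},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With a redis-py client, a response could then be stored with `client.setex(key, CACHE_TTL_SECONDS, value)` and looked up with `client.get(key)`. Because only the model and messages feed the hash, requests that differ only in other parameters (for example `temperature`) would share a cache entry under this scheme.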

Development

Project Structure

llm-gateway/
├── main.py          # FastAPI application and routing logic
├── middleware.py    # Authentication and rate limiting
├── database.py      # Database schema and initialization
├── requirements.txt # Python dependencies
├── Dockerfile       # Container configuration
├── docker-compose.yml # Multi-service setup
└── README.md        # This file

Adding New Providers

Add provider configuration to PROVIDERS dict in main.py
Implement routing logic in the gateway endpoint
Update model routing conditions
Add environment variable for API key
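
As an illustration of the steps above, a provider entry might carry a base URL, the name of its API key variable, and the model prefixes it serves. The real shape of the `PROVIDERS` dict in main.py is not shown in this README, so every field name below is an assumption, and the "acme" provider, its URL, and `ACME_API_KEY` are invented placeholders.

```python
# Hypothetical registry; the actual PROVIDERS dict in main.py may differ.
PROVIDERS: dict[str, dict] = {}


def register_provider(name: str, base_url: str, api_key_env: str,
                      model_prefixes: tuple[str, ...]) -> None:
    """Add a provider, rejecting duplicate names and overlapping prefixes."""
    if name in PROVIDERS:
        raise ValueError(f"Provider {name!r} is already registered")
    for other, config in PROVIDERS.items():
        clash = set(config["model_prefixes"]) & set(model_prefixes)
        if clash:
            # Two providers claiming one prefix would make routing ambiguous.
            raise ValueError(f"Prefixes {sorted(clash)} already routed to {other!r}")
    PROVIDERS[name] = {
        "base_url": base_url,
        "api_key_env": api_key_env,
        "model_prefixes": tuple(model_prefixes),
    }


# Placeholder provider used purely for illustration.
register_provider("acme", "https://api.acme.example/v1", "ACME_API_KEY", ("acme-",))
```

Keeping the prefixes in the provider entry means the routing conditions and the configuration are updated in one place.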

Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request

License

MIT License

Support

For issues and questions, please open a GitHub issue or contact the maintainers.
