TokenMeter exposes two categories of endpoints:
- Proxy Endpoints (/v1/...): OpenAI-compatible API that forwards to LLM providers
- Dashboard Endpoints (/api/...): Management APIs for the dashboard

All endpoints require authentication via Bearer token (API key).
Base URL: https://proxy.tokenmeter.dev (or http://localhost:8000 for local dev)
All requests must include an API key:
Authorization: Bearer tm_your_api_key_here
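
As a minimal sketch of the authentication scheme above, the helper below builds (but does not send) an authenticated request using only the Python standard library. The function name `build_request` and the path `/v1/chat/completions` are illustrative assumptions: the path follows the usual OpenAI convention and should be checked against your deployment.

```python
import json
import urllib.request
from typing import Optional

# Base URL from this document; swap in http://localhost:8000 for local dev.
BASE_URL = "https://proxy.tokenmeter.dev"


def build_request(path: str, api_key: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build an authenticated request object without sending it.

    A JSON body turns the request into a POST; otherwise it is a GET.
    """
    data = json.dumps(body).encode("utf-8") if body is not None else None
    method = "POST" if body is not None else "GET"
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    # Every TokenMeter endpoint expects the API key as a Bearer token.
    req.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


# Hypothetical path, assumed from the OpenAI-compatible naming convention.
example = build_request(
    "/v1/chat/completions",
    "tm_your_api_key_here",
    {"model": "gpt-4.1", "messages": [{"role": "user", "content": "Hello!"}]},
)
```

Passing `example` to `urllib.request.urlopen` would send it; that call is left out here so the sketch has no network side effects.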
OpenAI-compatible chat completions. Supports streaming (SSE).
Request Body:
{
"model": "gpt-4.1",
"messages": [
{"role": "system", "content": "You are helpful."},
{"role": "user", "content": "Hello!"}
],
"temperature": 0.7,
"max_tokens": 1000,
"stream": false,
"tools": [],
"response_format": {"type": "text"}
}

TokenMeter Extension Headers:

| Header | Type | Description |
|---|---|---|
| X-TM-Team | string | Team tag for cost attribution |
| X-TM-Feature | string | Feature tag for cost attribution |
| X-TM-Routing-Mode | string | Override routing: cost-optimized, latency-optimized, quality-optimized |

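
The extension headers above can be assembled with a small helper. This is a sketch, not part of any TokenMeter SDK: the function name `tm_headers` is hypothetical, and rejecting unknown routing modes on the client side is a design choice made here, not documented server behavior.

```python
from typing import Dict, Optional

# Routing modes listed in the header table above.
ROUTING_MODES = {"cost-optimized", "latency-optimized", "quality-optimized"}


def tm_headers(
    team: Optional[str] = None,
    feature: Optional[str] = None,
    routing_mode: Optional[str] = None,
) -> Dict[str, str]:
    """Return only the TokenMeter extension headers that were actually set."""
    headers: Dict[str, str] = {}
    if team is not None:
        headers["X-TM-Team"] = team
    if feature is not None:
        headers["X-TM-Feature"] = feature
    if routing_mode is not None:
        # Fail early on typos instead of relying on the proxy to reject them.
        if routing_mode not in ROUTING_MODES:
            raise ValueError(f"unknown routing mode: {routing_mode!r}")
        headers["X-TM-Routing-Mode"] = routing_mode
    return headers


# Illustrative team and feature tags; any strings meaningful to you will do.
headers = tm_headers(team="search", feature="autocomplete", routing_mode="cost-optimized")
```

Merge the returned dictionary with the `Authorization` header before sending the request.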
Response (non-streaming):
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1707350400,
"model": "gpt-4.1",
"choices": [
{
"index": 0,
"message": {"role": "assistant", "content": "Hello! How can I help?"},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 15,
"completion_tokens": 8,
"total_tokens": 23
},
"tm_provider": "openai",
"tm_cost_usd": 0.0000344,
"tm_latency_ms": 245.3,
"tm_cached": false,
"tm_routed_from": null
}

Response (streaming):
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-4.1","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-4.1","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-4.1","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":15,"completion_tokens":2,"total_tokens":17}}
data: [DONE]
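
A client consuming the streaming response has to strip the `data:` prefix, stop at `[DONE]`, and concatenate the `delta.content` fragments. The sketch below does that for an iterable of already-decoded lines; the function name `collect_stream` is hypothetical, and it assumes, as in the sample above, that usage arrives on a chunk before `[DONE]`.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Accumulate streamed text and return it with the usage block, if any."""
    parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data event
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), usage


# The sample events from this document, used as input.
sample = [
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-4.1","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-4.1","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-4.1","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":15,"completion_tokens":2,"total_tokens":17}}',
    "data: [DONE]",
]
text, usage = collect_stream(sample)
print(text)   # Hello!
print(usage)  # {'prompt_tokens': 15, 'completion_tokens': 2, 'total_tokens': 17}
```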
OpenAI-compatible embeddings endpoint.
Request:
{
"model": "text-embedding-3-small",
"input": "Hello world"
}

Response:
{
"object": "list",
"data": [
{
"object": "embedding",
"embedding": [0.0023, -0.009, ...],
"index": 0
}
],
"model": "text-embedding-3-small",
"usage": {"prompt_tokens": 2, "total_tokens": 2},
"tm_provider": "openai",
"tm_cost_usd": 0.00000004,
"tm_latency_ms": 89.2
}

List available models.
Complete dashboard data with all analytics.
Query Parameters:
| Param | Default | Description |
|---|---|---|
| period | 24h | Time period: 1h, 24h, 7d, 30d, 90d |
| team | none | Filter by team |
| feature | none | Filter by feature |
| model | none | Filter by model |
| granularity | hour | minute, hour, day, week |

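
The query parameters above map directly onto a URL query string. The sketch below builds one with `urllib.parse.urlencode` and validates `period` and `granularity` against the documented values. The function name `dashboard_query` is an assumption, and the full endpoint path is deliberately left to the caller because it is not given here.

```python
from typing import Optional
from urllib.parse import urlencode

# Allowed values taken from the parameter table above.
PERIODS = ("1h", "24h", "7d", "30d", "90d")
GRANULARITIES = ("minute", "hour", "day", "week")


def dashboard_query(
    period: str = "24h",
    granularity: str = "hour",
    team: Optional[str] = None,
    feature: Optional[str] = None,
    model: Optional[str] = None,
) -> str:
    """Return an encoded query string; optional filters are omitted when unset."""
    if period not in PERIODS:
        raise ValueError(f"period must be one of {PERIODS}")
    if granularity not in GRANULARITIES:
        raise ValueError(f"granularity must be one of {GRANULARITIES}")
    params = {"period": period, "granularity": granularity}
    for name, value in (("team", team), ("feature", feature), ("model", model)):
        if value is not None:
            params[name] = value
    return urlencode(params)


print(dashboard_query(period="7d", granularity="day", team="search"))
# period=7d&granularity=day&team=search
```

Append the result to the dashboard endpoint URL after a `?`.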
Time-series cost data.
Cost breakdown by model.
Cost breakdown by team.
List all budgets.
Create a budget.
{
"name": "Monthly AI Spend",
"amount_usd": 500.0,
"period": "monthly",
"team": null,
"hard_limit": false,
"thresholds": [
{"percentage": 50, "severity": "info", "channels": ["slack"]},
{"percentage": 80, "severity": "warning", "channels": ["slack"]},
{"percentage": 100, "severity": "critical", "channels": ["slack", "email"]}
]
}

Get budget status with utilization and projected spend.
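
To make the relationship between `amount_usd`, utilization, and `thresholds` concrete, here is a rough sketch of how a client could work out which alert thresholds a given spend has crossed. How TokenMeter evaluates thresholds internally is not described here. The sketch assumes a threshold fires once utilization reaches its percentage, and the function names are hypothetical.

```python
from typing import List


def utilization_pct(amount_usd: float, spent_usd: float) -> float:
    """Spend as a percentage of the budget amount."""
    if amount_usd <= 0:
        raise ValueError("amount_usd must be positive")
    return spent_usd * 100 / amount_usd


def crossed_thresholds(budget: dict, spent_usd: float) -> List[dict]:
    """Thresholds whose percentage is at or below current utilization, lowest first."""
    pct = utilization_pct(budget["amount_usd"], spent_usd)
    ordered = sorted(budget["thresholds"], key=lambda t: t["percentage"])
    return [t for t in ordered if pct >= t["percentage"]]


# The budget from the request body example in this section.
budget = {
    "name": "Monthly AI Spend",
    "amount_usd": 500.0,
    "period": "monthly",
    "team": None,
    "hard_limit": False,
    "thresholds": [
        {"percentage": 50, "severity": "info", "channels": ["slack"]},
        {"percentage": 80, "severity": "warning", "channels": ["slack"]},
        {"percentage": 100, "severity": "critical", "channels": ["slack", "email"]},
    ],
}

fired = crossed_thresholds(budget, spent_usd=400.0)
print([t["severity"] for t in fired])  # ['info', 'warning']
```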
Update a budget.
Delete a budget.
Get routing configuration.
Update routing mode, aliases, fallbacks.
{
"mode": "cost-optimized",
"enabled": true,
"model_aliases": {"gpt-4": "gpt-4.1"},
"fallback_models": {"gpt-5.2": ["gpt-5", "gpt-4.1"]}
}

Create a custom routing rule.
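
For intuition about the `model_aliases` and `fallback_models` fields in the routing configuration, the sketch below resolves a requested model into an ordered candidate list. The resolution order (alias first, then fallbacks in list order) is an assumption about how such a config would typically be interpreted, not confirmed proxy behavior, and `candidate_models` is a hypothetical name.

```python
from typing import List


def candidate_models(requested: str, config: dict) -> List[str]:
    """Return the models to try, in order, for a requested model name."""
    if not config.get("enabled", True):
        return [requested]  # routing disabled: pass the name through untouched
    # Assumption: aliases are applied before fallbacks are looked up.
    primary = config.get("model_aliases", {}).get(requested, requested)
    fallbacks = config.get("fallback_models", {}).get(primary, [])
    return [primary] + list(fallbacks)


# The routing configuration from the example in this section.
routing = {
    "mode": "cost-optimized",
    "enabled": True,
    "model_aliases": {"gpt-4": "gpt-4.1"},
    "fallback_models": {"gpt-5.2": ["gpt-5", "gpt-4.1"]},
}

print(candidate_models("gpt-4", routing))    # ['gpt-4.1']
print(candidate_models("gpt-5.2", routing))  # ['gpt-5.2', 'gpt-5', 'gpt-4.1']
```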
List all available models with pricing.
List API keys (masked).
Create a new API key. Returns the full key once.
Revoke an API key.
All errors follow the OpenAI error format:
{
"error": {
"message": "Invalid API key",
"type": "authentication_error",
"code": "invalid_api_key"
}
}

| Status | Type | Description |
|---|---|---|
| 401 | authentication_error | Invalid or missing API key |
| 429 | rate_limit_error | Rate limit exceeded |
| 429 | budget_exceeded | Hard budget limit reached |
| 502 | proxy_error | Upstream provider error |
| 503 | provider_unavailable | No provider configured for model |
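
Because both `rate_limit_error` and `budget_exceeded` return status 429, a client has to inspect the error body to tell them apart. The sketch below parses the error envelope and flags which errors are worth retrying. The retry policy (retry rate limits and upstream errors, never budget or authentication failures) is a suggestion made here, not documented behavior, and it assumes the table's Type column corresponds to `error.type`.

```python
from typing import Optional

# Assumed policy: these types may succeed on a later attempt.
RETRYABLE_TYPES = {"rate_limit_error", "proxy_error"}


def classify_error(status: int, body: dict) -> dict:
    """Summarize an error response and decide whether a retry makes sense."""
    err = body.get("error", {})
    err_type: Optional[str] = err.get("type")
    return {
        "status": status,
        "type": err_type,
        "code": err.get("code"),
        "message": err.get("message", ""),
        # A hard budget limit also returns 429 but will not clear on retry.
        "retryable": err_type in RETRYABLE_TYPES,
    }


# The sample error body from this section.
auth_failure = classify_error(
    401,
    {"error": {"message": "Invalid API key", "type": "authentication_error", "code": "invalid_api_key"}},
)
print(auth_failure["retryable"])  # False
```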