Merged
42 changes: 41 additions & 1 deletion README.md
@@ -36,6 +36,7 @@ application code.
| LMDeploy | Yes | Yes | External node registration | Uses LMDeploy PD connection pool and RDMA migration when available. |
| vLLM | Yes | Yes | Static, heartbeat | Supports two-stage KV transfer and static NIXL DP-aware rank routing. |
| SGLang | Yes | Yes | Static | Uses bootstrap dual dispatch with aligned prefill bootstrap ports. |
| DLEngine | Yes | Yes | dlslime-ctrl (`nanoctrl`) | Hybrid `dlengine serve` nodes; auto-discovery when `--ctrl_address` is set. |

DLRouter is configured with one backend type per router process through
`--backend`. Run multiple router processes if you need separate backend types at
@@ -100,6 +101,45 @@ curl -X POST http://localhost:8000/v1/chat/completions \
}'
```

### DLEngine with dlslime-ctrl discovery

Start the control plane and a DLEngine OpenAI server (see DLEngine
`dlengine serve`), then run DLRouter with auto-discovery:

```bash
dlslime-ctrl server --redis-url redis://127.0.0.1:6379

dlengine serve /path/to/model \
--host 0.0.0.0 --port 8100 \
--served-model-name Qwen3-4B \
--ctrl-address 127.0.0.1:4479

pip install -e ".[dlengine]" # pulls dlslime for NanoCtrlClient

python -m dlrouter \
--backend dlengine \
--serving_strategy hybrid \
--ctrl_address 127.0.0.1:4479
```

DLRouter polls dlslime-ctrl for entities of kind `dlengine` and registers
their HTTP endpoints. In requests, set `model` to the value passed to
`--served-model-name`. When `--ctrl_address` is omitted, nodes can still be
registered manually via `POST /nodes/add`.
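
The polling behaviour described above can be pictured as a reconcile step that runs on every poll. The following is only a rough sketch under stated assumptions: `Entity`, `NodeRegistry`, and `sync_once` are hypothetical names, the entity shape is not the dlslime-ctrl schema, and dropping nodes that vanish from the control plane is an assumption rather than documented DLRouter behaviour.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """Hypothetical control-plane entry (not the real dlslime-ctrl schema)."""
    kind: str
    url: str


@dataclass
class NodeRegistry:
    """Hypothetical stand-in for the router's table of registered nodes."""
    nodes: set = field(default_factory=set)

    def add(self, url: str) -> None:
        self.nodes.add(url)

    def remove(self, url: str) -> None:
        self.nodes.discard(url)


def sync_once(entities, registry, kind="dlengine"):
    """Reconcile the registry with one snapshot of control-plane entities."""
    # Only entities of the requested kind are considered.
    live = {e.url for e in entities if e.kind == kind}
    added = live - registry.nodes
    # Assumption: nodes missing from the snapshot are dropped.
    removed = registry.nodes - live
    for url in added:
        registry.add(url)
    for url in removed:
        registry.remove(url)
    return sorted(added), sorted(removed)


registry = NodeRegistry()
snapshot = [
    Entity("dlengine", "http://10.0.0.1:8100"),
    Entity("other", "http://10.0.0.2:9000"),
]
print(sync_once(snapshot, registry))  # first poll: one node added
print(sync_once([], registry))        # node gone: it is removed again
```

A real poller would call such a step on a timer with data fetched from the control plane; the sketch keeps the snapshot in memory so it runs without any services.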

Send a request (the served model name, model path, and its basename are all
accepted as the `model` value):

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen3-4B",
"messages": [{"role": "user", "content": "Hello!"}],
"stream": false
}'
```
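
For callers that prefer Python over `curl`, the same request can be built with the standard library. This is a minimal sketch: `build_chat_request` is a hypothetical helper, and the base URL assumes the router listens on `localhost:8000` as in the `curl` example.

```python
import json
import urllib.request


def build_chat_request(model, prompt, base_url="http://localhost:8000"):
    """Build (but do not send) a chat completion request for the router."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("Qwen3-4B", "Hello!")
print(request.get_method(), request.full_url)

# To send it, a running DLRouter is required:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```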

DLRouter also installs a `dlrouter` console script, so `dlrouter ...` is
equivalent to `python -m dlrouter ...` after installation.

@@ -226,7 +266,7 @@ be installed in the runtime environment.
|---|---|---|
| `--server_name` | `0.0.0.0` | Bind address. |
| `--server_port` | `8000` | Listen port. |
-| `--backend` | `lmdeploy` | Backend type: `lmdeploy`, `vllm`, or `sglang`. |
+| `--backend` | `lmdeploy` | Backend type: `lmdeploy`, `vllm`, `sglang`, or `dlengine`. |
| `--routing_strategy` | `min_expected_latency` | Request routing strategy. |
| `--serving_strategy` | `hybrid` | Serving mode: `hybrid` or `distserve`. |
| `--api_keys` | `None` | Comma-separated Bearer tokens for API authentication. |
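
As an illustration of how the options listed in this hunk fit together, here is a small `argparse` sketch. It is not DLRouter's actual parser: the flag names, defaults, and allowed values are taken from the table rows above, while the program name and the integer type for `--server_port` are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="dlrouter-sketch")
    parser.add_argument("--server_name", default="0.0.0.0")
    parser.add_argument("--server_port", type=int, default=8000)
    parser.add_argument(
        "--backend",
        default="lmdeploy",
        choices=["lmdeploy", "vllm", "sglang", "dlengine"],
    )
    parser.add_argument("--routing_strategy", default="min_expected_latency")
    parser.add_argument(
        "--serving_strategy",
        default="hybrid",
        choices=["hybrid", "distserve"],
    )
    # Comma-separated Bearer tokens; None means no tokens were supplied.
    parser.add_argument("--api_keys", default=None)
    return parser


args = build_parser().parse_args(["--backend", "dlengine"])
print(args.backend, args.serving_strategy, args.server_port)
```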
34 changes: 18 additions & 16 deletions dlrouter/api/app.py
@@ -17,7 +17,7 @@
)
from dlrouter.backends.factory import create_backend
from dlrouter.config import RouterConfig
-from dlrouter.constants import ServingStrategy
+from dlrouter.constants import ServiceDiscoveryMode, ServingStrategy
from dlrouter.core.health_check import HealthChecker
from dlrouter.core.node_manager import NodeManager
from dlrouter.core.proxy_engine import ProxyEngine
@@ -108,22 +108,24 @@ async def lifespan(application: FastAPI):
         cache_status=config.cache_status,
     )

-    # Service discovery (backend-specific, e.g., ZMQ for vLLM PD mode)
+    # Service discovery (backend-specific)
     service_discovery: Optional[Any] = None
-    if config.serving_strategy == ServingStrategy.DISTSERVE:
-        discovery_mode = backend.preferred_discovery_mode(config.backend_config)
-        if discovery_mode is not None:
-            service_discovery = backend.create_service_discovery(
-                discovery_mode,
-                config.backend_config,
-                node_manager,
-            )
-            # Allow heartbeat-based discovery to drop its registered-address
-            # cache when a node is removed (e.g. by HealthChecker after a
-            # crash), so a restarted instance can be re-registered.
-            unregister = getattr(service_discovery, 'unregister_by_url', None)
-            if callable(unregister):
-                node_manager.add_remove_listener(unregister)
+    discovery_mode = backend.preferred_discovery_mode(config.backend_config)
+    use_discovery = discovery_mode is not None and (
+        config.serving_strategy == ServingStrategy.DISTSERVE
+        or discovery_mode == ServiceDiscoveryMode.NANOCTRL
+    )
+    if use_discovery:
+        service_discovery = backend.create_service_discovery(
+            discovery_mode,
+            config.backend_config,
+            node_manager,
+        )
+        # Allow discovery to drop its registered-address cache when a node is
+        # removed (e.g. by HealthChecker), so a restarted instance can re-register.
+        unregister = getattr(service_discovery, 'unregister_by_url', None)
+        if callable(unregister):
+            node_manager.add_remove_listener(unregister)

     # Proxy engine
     proxy_engine = ProxyEngine(node_manager)
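
The new gating condition in this hunk can be read as a small truth table: discovery runs whenever the backend reports a discovery mode and either the router is in `distserve` mode or that mode is `NANOCTRL`. The sketch below isolates the predicate for illustration. The enum classes are local stand-ins, not the ones from `dlrouter.constants`; the member values and the `STATIC` and `HEARTBEAT` members are assumptions, and `should_use_discovery` is a hypothetical helper name.

```python
from enum import Enum
from typing import Optional


class ServingStrategy(Enum):
    HYBRID = "hybrid"
    DISTSERVE = "distserve"


class ServiceDiscoveryMode(Enum):
    STATIC = "static"        # assumed member
    HEARTBEAT = "heartbeat"  # assumed member
    NANOCTRL = "nanoctrl"


def should_use_discovery(
    serving_strategy: ServingStrategy,
    discovery_mode: Optional[ServiceDiscoveryMode],
) -> bool:
    """Mirror the `use_discovery` expression from the hunk above."""
    return discovery_mode is not None and (
        serving_strategy == ServingStrategy.DISTSERVE
        or discovery_mode == ServiceDiscoveryMode.NANOCTRL
    )


for strategy in ServingStrategy:
    for mode in (None, *ServiceDiscoveryMode):
        label = mode.name if mode is not None else "None"
        print(strategy.name, label, should_use_discovery(strategy, mode))
```

The practical effect is that a hybrid deployment using `NANOCTRL` now gets service discovery, whereas before the change discovery was created only under `distserve`.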
8 changes: 8 additions & 0 deletions dlrouter/backends/__init__.py
@@ -2,6 +2,11 @@

from dlrouter.backends.base import BaseBackend
from dlrouter.backends.definition import BackendDefinition
from dlrouter.backends.dlengine import (
DLENGINE_BACKEND_DEFINITION,
DLEngineBackend,
DLEngineConfig,
)
from dlrouter.backends.factory import create_backend, get_backend_definition
from dlrouter.backends.lmdeploy import (
LMDEPLOY_BACKEND_DEFINITION,
@@ -21,11 +26,14 @@


__all__ = [
'DLENGINE_BACKEND_DEFINITION',
'LMDEPLOY_BACKEND_DEFINITION',
'SGLANG_BACKEND_DEFINITION',
'VLLM_BACKEND_DEFINITION',
'BackendDefinition',
'BaseBackend',
'DLEngineBackend',
'DLEngineConfig',
'LMDeployBackend',
'LMDeployPDConfig',
'SGLangBackend',
12 changes: 12 additions & 0 deletions dlrouter/backends/dlengine/__init__.py
@@ -0,0 +1,12 @@
"""DLEngine backend package."""

from dlrouter.backends.dlengine.backend import DLEngineBackend
from dlrouter.backends.dlengine.config import DLEngineConfig
from dlrouter.backends.dlengine.definition import DLENGINE_BACKEND_DEFINITION


__all__ = [
'DLENGINE_BACKEND_DEFINITION',
'DLEngineBackend',
'DLEngineConfig',
]