docs(llmisvc): add config composition deep-dive page #689
Add a dedicated page documenting how LLMInferenceService config composition works - the merge/injection machinery that combines well-known configs, user baseRefs, and the service spec into a final effective configuration. The page leads with a concrete example showing what a user writes, what the controller resolves, and the color-coded effective result with field provenance. It then covers configuration sources, merge order, config injection decision logic per deployment topology, strategic merge patch behavior, and namespace resolution. Also removes the now-redundant "Composition Merge Order" section from the configuration page in favor of the new deep-dive.

Signed-off-by: Bartosz Majsak <bartosz.majsak@gmail.com>
✅ Deploy Preview for elastic-nobel-0aef7a ready!
pierDipi
left a comment
This is very nice, thanks!
Just one nit
/lgtm
Maybe we can also link from "Composable Configuration" here: https://deploy-preview-689--elastic-nobel-0aef7a.netlify.app/docs/next/model-serving/generative-inference/llmisvc/llmisvc-overview#summary
Co-authored-by: Pierangelo Di Pilato <pierangelodipilato@gmail.com>
Signed-off-by: Bartosz Majsak <bartosz.majsak@gmail.com>
# LLMInferenceService Config Composition

A full LLM inference deployment touches container images, probes, security contexts, scheduler settings, routing rules, resource limits, and more. Most users should not have to care about all of that. KServe ships sensible defaults that cover common deployments out of the box - a working service needs a model URI, a reference to a hardware profile, and a few feature toggles like routing and scheduling. When you do need to change something - a different GPU type, a custom routing policy, a longer startup probe - you override only that field and the defaults you did not touch stay in place.
Curious if we could swap all hardware profile references for something more generic like "accelerator config" or something. It is used in the configs in question (example)
1. The spec has no `prefill` and no `worker` - single-node deployment. The controller injects `kserve-config-llm-template`.
2. The spec has `router.scheduler` defined without an external pool ref - the controller injects `kserve-config-llm-scheduler`.
3. The spec has `router.route` defined without external route refs - the controller injects `kserve-config-llm-router-route`.
4. The `baseRefs` reference `my-gpu-profile` - the controller fetches it from the `kserve` namespace (not found in `my-team`).
I am liking the direction of these docs -- still reading -- but I do wonder if the answer to "why" the controller injects x vs y could not be explained here
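For readers following this thread, the three injection rules quoted in the hunk above can be modeled roughly as below. This is only an illustrative Python sketch, not the controller's code: the spec is treated as a plain dict, and the keys `externalPoolRef` and `externalRouteRefs` are hypothetical placeholders for whatever fields hold the external pool and route references. Topologies with `prefill` or `worker` are not modeled because the quoted hunk does not say what gets injected for them.

```python
def well_known_configs(spec: dict) -> list[str]:
    """Return the well-known config names the quoted rules would inject."""
    injected = []

    # Rule 1: no prefill and no worker means a single-node deployment.
    if "prefill" not in spec and "worker" not in spec:
        injected.append("kserve-config-llm-template")

    router = spec.get("router") or {}

    # Rule 2: scheduler is defined (even as an empty object) and has no
    # external pool reference. "externalPoolRef" is a hypothetical key.
    scheduler = router.get("scheduler")
    if scheduler is not None and "externalPoolRef" not in scheduler:
        injected.append("kserve-config-llm-scheduler")

    # Rule 3: route is defined and has no external route references.
    # "externalRouteRefs" is a hypothetical key.
    route = router.get("route")
    if route is not None and "externalRouteRefs" not in route:
        injected.append("kserve-config-llm-router-route")

    return injected


# A spec shaped like the example in the hunk: single node, managed
# scheduler and route, one user baseRef.
example_spec = {
    "router": {"scheduler": {}, "route": {}},
    "baseRefs": [{"name": "my-gpu-profile"}],
}
print(well_known_configs(example_spec))
```

The sketch only captures the "what"; the "why" behind each rule is the part the comment above asks the page to explain.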
The sources that get merged (well-known configs are auto-injected, baseRef is resolved from the `kserve` namespace):

<div style={{display: 'flex', gap: '0.5rem', flexWrap: 'wrap', fontSize: '0.8rem', lineHeight: '1.4', marginBottom: '0.5rem'}}>

The controller uses Kubernetes `strategicpatch.StrategicMergePatch` to combine configs. Here is how the merge works at the field level:

- **Non-zero fields** from the override are applied to the base. Existing base fields that the override does not mention are left untouched.
- **Zero-valued fields** (empty string `""`, `0`, `nil`, `false`) in the override do **not** overwrite base values. This prevents a config that does not specify a port from wiping out the well-known config's port.
Not the well known configs, but the baseRefs continue to apply right?
- **Non-zero fields** from the override are applied to the base. Existing base fields that the override does not mention are left untouched.
- **Zero-valued fields** (empty string `""`, `0`, `nil`, `false`) in the override do **not** overwrite base values. This prevents a config that does not specify a port from wiping out the well-known config's port.
- **Container lists** are merged by name. The `main` container from different sources merges into a single `main` container rather than creating duplicates. Other list fields may replace entirely depending on their strategic merge patch annotations.
Would it be possible to explain what you mean by non-main containers merging differently?
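To make the three bullets in this hunk concrete, here is a simplified Python model of the described behavior. It is not `strategicpatch.StrategicMergePatch` itself: the real implementation is driven by `patchStrategy`/`patchMergeKey` annotations on the Go API types, whereas this sketch hard-codes "merge `containers` by `name`" and replaces every other list wholesale. The field names and image strings in the example are made up for illustration.

```python
def is_zero(value) -> bool:
    """Zero values as listed in the hunk: "", 0, nil (None), false."""
    return value is None or value is False or value == "" or value == 0


def merge_containers(base_list: list, override_list: list) -> list:
    """Merge container lists by the 'name' key instead of appending."""
    merged = {c["name"]: c for c in base_list}  # dict keeps insertion order
    for container in override_list:
        name = container["name"]
        if name in merged:
            merged[name] = merge(merged[name], container)
        else:
            merged[name] = container
    return list(merged.values())


def merge(base: dict, override: dict) -> dict:
    """Apply non-zero override fields on top of base; skip zero values."""
    result = dict(base)
    for key, value in override.items():
        if is_zero(value):
            continue  # zero-valued override fields never wipe base values
        if key == "containers" and isinstance(value, list):
            result[key] = merge_containers(result.get(key, []), value)
        elif isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value  # scalars and other lists replace entirely
    return result


base = {
    "port": 8000,
    "containers": [{"name": "main", "image": "default-image", "args": ["--a"]}],
}
override = {
    "port": 0,  # unset in the override, so the base port survives
    "containers": [{"name": "main", "image": "custom-image"}],
}
effective = merge(base, override)
print(effective)
```

In this model the two `main` entries collapse into one container that keeps the base `args` and takes the override `image`, which is the behavior the third bullet describes.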
1. **The LLMInferenceService's own namespace** (highest priority)
2. **The KServe system namespace** (typically `kserve`)
3. If not found in either namespace, the controller sets the `PresetsCombined` condition to `False` with reason `ConfigNotFound`
Sorry, perhaps a dumb question on conditions -- will this mark the LLMInferenceService as NOT Ready?
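The lookup order in this hunk can be sketched as follows. This is a hedged Python illustration only: `configs` is a hypothetical in-memory stand-in for the cluster, keyed by `(namespace, name)`, and the returned condition is a plain dict rather than the controller's actual status type. The sketch deliberately says nothing about overall readiness, since the hunk does not state how `PresetsCombined` feeds into `Ready`.

```python
def resolve_base_ref(name, service_namespace, configs, system_namespace="kserve"):
    """Look up a baseRef: service namespace first, then the system namespace.

    Returns (config, condition). Exactly one of the two is None.
    """
    for namespace in (service_namespace, system_namespace):
        config = configs.get((namespace, name))
        if config is not None:
            return config, None

    # Not found in either namespace: report it through the condition
    # named in the hunk.
    condition = {
        "type": "PresetsCombined",
        "status": "False",
        "reason": "ConfigNotFound",
    }
    return None, condition


# Hypothetical cluster contents mirroring the page's example: the profile
# exists only in the system namespace.
configs = {("kserve", "my-gpu-profile"): {"source": "kserve"}}

found, cond = resolve_base_ref("my-gpu-profile", "my-team", configs)
missing, missing_cond = resolve_base_ref("no-such-config", "my-team", configs)
print(found, cond)
print(missing, missing_cond)
```

If a config with the same name existed in both namespaces, the copy in the service's own namespace would win because it is checked first.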

Summary
baseRefs, and the service spec
Motivation
The existing configuration page covers composition at a conceptual level but doesn't explain the machinery: which well-known configs get injected and why, how the controller decides based on deployment pattern, what fields each config brings, and what the final merged result looks like. This page closes that gap with a visual, example-first approach.
Test plan
/docs/model-serving/generative-inference/llmisvc/llmisvc-config-composition
`npm run build` succeeds without errors