docs(llmisvc): add config composition deep-dive page by bartoszmajsak · Pull Request #689 · kserve/website

bartoszmajsak · 2026-06-03T21:29:26Z

Summary

- Adds a new "Config Composition" page documenting how LLMInferenceService builds its effective configuration from well-known configs, user baseRefs, and the service spec
- Leads with a concrete example: what the user writes, what the controller resolves, and a color-coded merged result showing field provenance
- Covers configuration sources and merge order, config injection per deployment topology (single-node, multi-node, disaggregated), strategic merge patch behavior, and namespace resolution
- Removes the redundant "Composition Merge Order" section from the configuration page in favor of the new deep-dive
- Adds sidebar entry and cross-link from the configuration page

Motivation

The existing configuration page covers composition at a conceptual level but doesn't explain the machinery: which well-known configs get injected and why, how the controller decides based on deployment pattern, what fields each config brings, and what the final merged result looks like. This page closes that gap with a visual, example-first approach.

Test plan

- Verify page renders at /docs/model-serving/generative-inference/llmisvc/llmisvc-config-composition
- Verify Mermaid diagram renders with correct color coding
- Verify colored MDX panels and result block display properly
- Verify sidebar entry appears after "Configuration"
- Verify cross-link from configuration page works
- Verify `npm run build` succeeds without errors

Add a dedicated page documenting how LLMInferenceService config composition works - the merge/injection machinery that combines well-known configs, user baseRefs, and the service spec into a final effective configuration.

The page leads with a concrete example showing what a user writes, what the controller resolves, and the color-coded effective result with field provenance. It then covers configuration sources, merge order, config injection decision logic per deployment topology, strategic merge patch behavior, and namespace resolution.

Also removes the now-redundant "Composition Merge Order" section from the configuration page in favor of the new deep-dive.

Signed-off-by: Bartosz Majsak <bartosz.majsak@gmail.com>

netlify · 2026-06-03T21:29:32Z

✅ Deploy Preview for elastic-nobel-0aef7a ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | `d5cc77b` |
| 🔍 Latest deploy log | https://app.netlify.com/projects/elastic-nobel-0aef7a/deploys/6a21452d8cef3f00085ecd0e |
| 😎 Deploy Preview | https://deploy-preview-689--elastic-nobel-0aef7a.netlify.app |

pierDipi

This is very nice, thanks!

Just one nit

/lgtm

pierDipi · 2026-06-04T07:50:40Z

maybe we can also link from "Composable Configuration" in here https://deploy-preview-689--elastic-nobel-0aef7a.netlify.app/docs/next/model-serving/generative-inference/llmisvc/llmisvc-overview#summary

Co-authored-by: Pierangelo Di Pilato <pierangelodipilato@gmail.com>
Signed-off-by: Bartosz Majsak <bartosz.majsak@gmail.com>

andrewballantyne · 2026-06-04T12:44:09Z

+
+# LLMInferenceService Config Composition
+
+A full LLM inference deployment touches container images, probes, security contexts, scheduler settings, routing rules, resource limits, and more. Most users should not have to care about all of that. KServe ships sensible defaults that cover common deployments out of the box - a working service needs a model URI, a reference to a hardware profile, and a few feature toggles like routing and scheduling. When you do need to change something - a different GPU type, a custom routing policy, a longer startup probe - you override only that field and the defaults you did not touch stay in place.


Curious if we could swap all hardware profile references for something more generic like "accelerator config" or something. It is used in the configs in question (example)

andrewballantyne · 2026-06-04T12:48:02Z

+1. The spec has no `prefill` and no `worker` - single-node deployment. The controller injects `kserve-config-llm-template`.
+2. The spec has `router.scheduler` defined without an external pool ref - the controller injects `kserve-config-llm-scheduler`.
+3. The spec has `router.route` defined without external route refs - the controller injects `kserve-config-llm-router-route`.
+4. The `baseRefs` reference `my-gpu-profile` - the controller fetches it from the `kserve` namespace (not found in `my-team`).


I am liking the direction of these docs -- still reading -- but I do wonder if the answer to "why" the controller injects x vs y could not be explained here
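The four resolution steps quoted in the hunk above can be read as a small decision function. The sketch below is only an illustration in Python, not the controller's code: the config names come from the quoted docs, while the keys `externalPoolRef` and `externalRouteRefs` are hypothetical stand-ins for "an external pool ref" and "external route refs". Configs for multi-node and disaggregated topologies are not named in this thread, so the sketch injects no template for them.

```python
from typing import Any


def well_known_configs(spec: dict[str, Any]) -> list[str]:
    """Return the well-known configs to inject for a spec (illustrative only)."""
    injected: list[str] = []

    # Step 1: no `prefill` and no `worker` means a single-node deployment.
    if "prefill" not in spec and "worker" not in spec:
        injected.append("kserve-config-llm-template")

    router = spec.get("router") or {}

    # Step 2: a scheduler defined without an external pool reference.
    # `externalPoolRef` is a hypothetical key name used for this sketch.
    scheduler = router.get("scheduler")
    if scheduler is not None and "externalPoolRef" not in scheduler:
        injected.append("kserve-config-llm-scheduler")

    # Step 3: a route defined without external route references.
    # `externalRouteRefs` is a hypothetical key name used for this sketch.
    route = router.get("route")
    if route is not None and "externalRouteRefs" not in route:
        injected.append("kserve-config-llm-router-route")

    return injected


if __name__ == "__main__":
    example_spec = {"router": {"scheduler": {}, "route": {}}}
    print(well_known_configs(example_spec))
```

Written this way, each injected config maps to one condition on the spec, which may be one way to answer the "why x vs y" question directly in the page.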

andrewballantyne · 2026-06-04T12:49:09Z

+
+The sources that get merged (well-known configs are auto-injected, baseRef is resolved from the `kserve` namespace):
+
+<div style={{display: 'flex', gap: '0.5rem', flexWrap: 'wrap', fontSize: '0.8rem', lineHeight: '1.4', marginBottom: '0.5rem'}}>


FWIW, you run into a dark screen problem here:

andrewballantyne · 2026-06-04T13:07:09Z

+The controller uses Kubernetes `strategicpatch.StrategicMergePatch` to combine configs. Here is how the merge works at the field level:
+
+- **Non-zero fields** from the override are applied to the base. Existing base fields that the override does not mention are left untouched.
+- **Zero-valued fields** (empty string `""`, `0`, `nil`, `false`) in the override do **not** overwrite base values. This prevents a config that does not specify a port from wiping out the well-known config's port.


Not the well known configs, but the baseRefs continue to apply right?
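For readers following the zero-value rule in the quoted bullets, here is a minimal Python sketch of the behavior as the page describes it. It is not `strategicpatch.StrategicMergePatch` itself, only a simplified model over plain dictionaries; the field names and values in the example (`port`, `image`, `probe`) are made up for illustration.

```python
from typing import Any


def is_zero(value: Any) -> bool:
    """Treat None, "", 0 and False as zero values, as in the quoted bullet."""
    if value is None:
        return True
    if isinstance(value, (str, bool, int, float)):
        return not value
    return False


def merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Apply non-zero override fields onto base; leave everything else alone."""
    result = dict(base)
    for key, value in override.items():
        if is_zero(value):
            # A zero-valued override never wipes out the base value.
            continue
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Nested objects are merged field by field.
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result


if __name__ == "__main__":
    base = {"port": 8000, "image": "example/base:1", "probe": {"delay": 10, "path": "/health"}}
    override = {"port": 0, "image": "", "probe": {"delay": 120}}
    print(merge(base, override))
```

In this model the override changes only `probe.delay`; the port, image, and probe path from the base survive because the override either omits them or carries a zero value.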

andrewballantyne · 2026-06-04T13:16:23Z

+
+- **Non-zero fields** from the override are applied to the base. Existing base fields that the override does not mention are left untouched.
+- **Zero-valued fields** (empty string `""`, `0`, `nil`, `false`) in the override do **not** overwrite base values. This prevents a config that does not specify a port from wiping out the well-known config's port.
+- **Container lists** are merged by name. The `main` container from different sources merges into a single `main` container rather than creating duplicates. Other list fields may replace entirely depending on their strategic merge patch annotations.


Would it be possible to explain what you mean by non-main containers merging differently?
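One reading of the container bullet is that merging by name applies to every container, not just `main`, and that the "replace entirely" case concerns other list fields. In the Kubernetes PodSpec, `containers` is declared with a merge strategy keyed on `name`, while lists such as a container's `args` carry no merge strategy and are replaced wholesale. The Python sketch below models only that distinction; it is a simplification rather than the real patch algorithm, and the container contents are invented for the example.

```python
from typing import Any


def merge_containers(
    base: list[dict[str, Any]], override: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Merge two container lists by `name` (simplified model)."""
    merged = {container["name"]: dict(container) for container in base}
    for container in override:
        name = container["name"]
        if name in merged:
            # Same name: fields are combined; a list field such as `args`
            # is replaced as a whole, not concatenated.
            merged[name].update(container)
        else:
            # New name: the container is appended.
            merged[name] = dict(container)
    return list(merged.values())


if __name__ == "__main__":
    base = [{"name": "main", "image": "example/server:1", "args": ["--port=8000"]}]
    override = [
        {"name": "main", "args": ["--port=9000"]},
        {"name": "sidecar", "image": "example/sidecar:1"},
    ]
    for container in merge_containers(base, override):
        print(container)
```

Here the two `main` entries collapse into one container that keeps the base image and takes the override's `args`, and `sidecar` is added alongside it.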

andrewballantyne · 2026-06-04T13:18:01Z

+
+1. **The LLMInferenceService's own namespace** (highest priority)
+2. **The KServe system namespace** (typically `kserve`)
+3. If not found in either namespace, the controller sets the `PresetsCombined` condition to `False` with reason `ConfigNotFound`


Sorry, perhaps a dumb question on conditions -- will this mark the LLMInferenceService as NOT Ready?
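The lookup order in the quoted list can be sketched as follows. This is an assumption-laden illustration in Python: the in-memory `store` keyed by `(namespace, name)` and the `Condition` type are hypothetical, and only the condition type `PresetsCombined`, the reason `ConfigNotFound`, and the two-namespace order come from the quoted docs. Whether that condition also flips the service's overall readiness is not stated in this thread, so the sketch does not model it.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Condition:
    type: str
    status: str
    reason: str = ""


def resolve_config(
    name: str,
    service_namespace: str,
    store: dict[tuple[str, str], dict[str, Any]],
    system_namespace: str = "kserve",
) -> tuple[Optional[dict[str, Any]], Optional[Condition]]:
    """Look up a config in the service namespace first, then the system one."""
    for namespace in (service_namespace, system_namespace):
        config = store.get((namespace, name))
        if config is not None:
            return config, None
    # Not found in either namespace: report the failure as a condition.
    return None, Condition("PresetsCombined", "False", "ConfigNotFound")


if __name__ == "__main__":
    store = {("kserve", "my-gpu-profile"): {"source": "kserve"}}
    print(resolve_config("my-gpu-profile", "my-team", store))
    print(resolve_config("missing", "my-team", store))
```

A config with the same name in the service's own namespace would shadow the one in the system namespace, which matches the "highest priority" note in the first list item.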

Comment thread docs/model-serving/generative-inference/llmisvc/llmisvc-config-composition.md Outdated

chore: adds link to k8s strategic merge patch

d5cc77b

Co-authored-by: Pierangelo Di Pilato <pierangelodipilato@gmail.com>
Signed-off-by: Bartosz Majsak <bartosz.majsak@gmail.com>

bartoszmajsak wants to merge 2 commits into kserve:main from bartoszmajsak:docs/llmisvc-config-composition


3 participants

