docs(llmisvc): add config composition deep-dive page #689

Open
bartoszmajsak wants to merge 2 commits into kserve:main from bartoszmajsak:docs/llmisvc-config-composition

@bartoszmajsak
Contributor

Summary

  • Adds a new "Config Composition" page documenting how LLMInferenceService builds its effective configuration from well-known configs, user baseRefs, and the service spec
  • Leads with a concrete example: what the user writes, what the controller resolves, and a color-coded merged result showing field provenance
  • Covers configuration sources and merge order, config injection per deployment topology (single-node, multi-node, disaggregated), strategic merge patch behavior, and namespace resolution
  • Removes the redundant "Composition Merge Order" section from the configuration page in favor of the new deep-dive
  • Adds sidebar entry and cross-link from the configuration page

Motivation

The existing configuration page covers composition at a conceptual level but doesn't explain the machinery: which well-known configs get injected and why, how the controller decides based on deployment pattern, what fields each config brings, and what the final merged result looks like. This page closes that gap with a visual, example-first approach.

Test plan

  • Verify page renders at /docs/model-serving/generative-inference/llmisvc/llmisvc-config-composition
  • Verify Mermaid diagram renders with correct color coding
  • Verify colored MDX panels and result block display properly
  • Verify sidebar entry appears after "Configuration"
  • Verify cross-link from configuration page works
  • Verify npm run build succeeds without errors

Add a dedicated page documenting how LLMInferenceService config
composition works - the merge/injection machinery that combines
well-known configs, user baseRefs, and the service spec into a
final effective configuration.

The page leads with a concrete example showing what a user writes,
what the controller resolves, and the color-coded effective result
with field provenance. It then covers configuration sources, merge
order, config injection decision logic per deployment topology,
strategic merge patch behavior, and namespace resolution.

Also removes the now-redundant "Composition Merge Order" section
from the configuration page in favor of the new deep-dive.

Signed-off-by: Bartosz Majsak <bartosz.majsak@gmail.com>
@netlify

netlify Bot commented Jun 3, 2026

Deploy Preview for elastic-nobel-0aef7a ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | d5cc77b |
| 🔍 Latest deploy log | https://app.netlify.com/projects/elastic-nobel-0aef7a/deploys/6a21452d8cef3f00085ecd0e |
| 😎 Deploy Preview | https://deploy-preview-689--elastic-nobel-0aef7a.netlify.app |

Comment thread docs/model-serving/generative-inference/llmisvc/llmisvc-config-composition.md Outdated
Member

@pierDipi pierDipi left a comment

This is very nice, thanks!

Just one nit

/lgtm

@pierDipi
Member

pierDipi commented Jun 4, 2026

Maybe we can also link to the new page from "Composable Configuration" here: https://deploy-preview-689--elastic-nobel-0aef7a.netlify.app/docs/next/model-serving/generative-inference/llmisvc/llmisvc-overview#summary

Co-authored-by: Pierangelo Di Pilato <pierangelodipilato@gmail.com>
Signed-off-by: Bartosz Majsak <bartosz.majsak@gmail.com>

# LLMInferenceService Config Composition

A full LLM inference deployment touches container images, probes, security contexts, scheduler settings, routing rules, resource limits, and more. Most users should not have to care about all of that. KServe ships sensible defaults that cover common deployments out of the box - a working service needs a model URI, a reference to a hardware profile, and a few feature toggles like routing and scheduling. When you do need to change something - a different GPU type, a custom routing policy, a longer startup probe - you override only that field and the defaults you did not touch stay in place.

Curious whether we could swap all "hardware profile" references for something more generic, such as "accelerator config". The term is used in the configs in question (example).

Comment on lines +82 to +85
1. The spec has no `prefill` and no `worker` - single-node deployment. The controller injects `kserve-config-llm-template`.
2. The spec has `router.scheduler` defined without an external pool ref - the controller injects `kserve-config-llm-scheduler`.
3. The spec has `router.route` defined without external route refs - the controller injects `kserve-config-llm-router-route`.
4. The `baseRefs` reference `my-gpu-profile` - the controller fetches it from the `kserve` namespace (not found in `my-team`).
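
The four resolution steps above can be modelled as a small decision function. This is a minimal sketch, not the controller's code: the `Spec` and `Router` classes and their boolean flags are hypothetical simplifications of the real API, and only the three config names quoted in the steps are used. Configs for multi-node or disaggregated topologies are not named in this excerpt, so the sketch leaves them out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Router:
    # Hypothetical flags standing in for the real router spec fields.
    scheduler_defined: bool = False
    external_pool_ref: bool = False
    route_defined: bool = False
    external_route_refs: bool = False


@dataclass
class Spec:
    # Hypothetical, flattened view of an LLMInferenceService spec.
    has_prefill: bool = False
    has_worker: bool = False
    router: Optional[Router] = None
    base_refs: List[str] = field(default_factory=list)


def well_known_configs(spec: Spec) -> List[str]:
    """Return the well-known configs injected for this spec (steps 1-3)."""
    injected = []
    # Step 1: no prefill and no worker means a single-node deployment.
    if not spec.has_prefill and not spec.has_worker:
        injected.append("kserve-config-llm-template")
    router = spec.router
    # Step 2: scheduler requested and no external pool to defer to.
    if router and router.scheduler_defined and not router.external_pool_ref:
        injected.append("kserve-config-llm-scheduler")
    # Step 3: route requested and no external routes to defer to.
    if router and router.route_defined and not router.external_route_refs:
        injected.append("kserve-config-llm-router-route")
    return injected


example = Spec(
    router=Router(scheduler_defined=True, route_defined=True),
    base_refs=["my-gpu-profile"],
)
print(well_known_configs(example))
```

Step 4 (fetching `my-gpu-profile`) is a lookup rather than an injection decision, so it is not part of this function.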

I like the direction of these docs (still reading), but I wonder whether the "why" behind the controller injecting x vs y could be explained here.


The sources that get merged (well-known configs are auto-injected, baseRef is resolved from the `kserve` namespace):

<div style={{display: 'flex', gap: '0.5rem', flexWrap: 'wrap', fontSize: '0.8rem', lineHeight: '1.4', marginBottom: '0.5rem'}}>

FWIW, you run into a dark screen problem here:

[Image: screenshot]

The controller uses Kubernetes `strategicpatch.StrategicMergePatch` to combine configs. Here is how the merge works at the field level:

- **Non-zero fields** from the override are applied to the base. Existing base fields that the override does not mention are left untouched.
- **Zero-valued fields** (empty string `""`, `0`, `nil`, `false`) in the override do **not** overwrite base values. This prevents a config that does not specify a port from wiping out the well-known config's port.

Not the well-known configs, but the baseRefs continue to apply, right?


- **Container lists** are merged by name. The `main` container from different sources merges into a single `main` container rather than creating duplicates. Other list fields may replace entirely depending on their strategic merge patch annotations.
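
The field-level rules in these bullets can be illustrated with a toy merge over plain dictionaries. This is a sketch under stated assumptions, not the Kubernetes `strategicpatch` implementation: it hard-codes `containers` as the only list merged by `name`, replaces every other list wholesale, and uses a made-up image name (`vllm:base`) and values.

```python
def is_zero(value):
    """Zero values as listed above: empty string, 0, nil (None), false."""
    return value is None or value == "" or value == 0


def merge_by_name(base_list, override_list):
    """Merge list items that share a 'name'; append items with new names."""
    merged = [dict(item) for item in base_list]
    index = {item["name"]: i for i, item in enumerate(merged)}
    for item in override_list:
        if item["name"] in index:
            i = index[item["name"]]
            merged[i] = merge(merged[i], item)
        else:
            index[item["name"]] = len(merged)
            merged.append(dict(item))
    return merged


def merge(base, override):
    """Apply non-zero override fields onto base without mutating base."""
    result = dict(base)
    for key, value in override.items():
        if is_zero(value):
            continue  # zero-valued override fields never wipe base values
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        elif key == "containers" and isinstance(value, list):
            result[key] = merge_by_name(result.get(key) or [], value)
        else:
            result[key] = value  # scalars and other lists replace entirely
    return result


# Hypothetical base (well-known config) and override (user config).
base = {
    "replicas": 1,
    "containers": [
        {"name": "main", "image": "vllm:base", "args": ["--a"],
         "ports": [{"containerPort": 8000}]},
    ],
}
override = {
    "replicas": 0,  # zero value: ignored
    "containers": [
        {"name": "main", "image": "", "args": ["--b"],
         "resources": {"limits": {"nvidia.com/gpu": "1"}}},
    ],
}
effective = merge(base, override)
print(effective)
```

In this example the two `main` entries collapse into one container that keeps the base image and port, gains the override's resource limits, and has its `args` list replaced rather than concatenated.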

Would it be possible to explain what you mean by non-main containers merging differently?


1. **The LLMInferenceService's own namespace** (highest priority)
2. **The KServe system namespace** (typically `kserve`)
3. If not found in either namespace, the controller sets the `PresetsCombined` condition to `False` with reason `ConfigNotFound`
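
The lookup order above can be sketched as a two-step search. This is an illustrative model only: the `configs` dictionary stands in for the cluster, the function name and return shape are hypothetical, and the condition is returned only for the not-found case because that is the only case the page describes.

```python
def resolve_config(name, service_namespace, configs, system_namespace="kserve"):
    """Look up a config by name, preferring the service's own namespace.

    `configs` maps (namespace, name) to a config object; it is a stand-in
    for whatever store the controller actually queries.
    Returns (config, namespace, condition).
    """
    # 1. The service's own namespace wins; 2. fall back to the system namespace.
    for namespace in (service_namespace, system_namespace):
        config = configs.get((namespace, name))
        if config is not None:
            return config, namespace, None
    # 3. Not found anywhere: report the condition described above.
    condition = {
        "type": "PresetsCombined",
        "status": "False",
        "reason": "ConfigNotFound",
    }
    return None, None, condition


# Hypothetical cluster contents: the profile exists only in `kserve`.
cluster = {("kserve", "my-gpu-profile"): {"accelerator": "example"}}

print(resolve_config("my-gpu-profile", "my-team", cluster))
print(resolve_config("missing-profile", "my-team", cluster))
```

With these inputs the first call resolves `my-gpu-profile` from `kserve`, matching the earlier example where the config is not found in `my-team`.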

Sorry, perhaps a dumb question on conditions -- will this mark the LLMInferenceService as NOT Ready?


3 participants