
Moving things into LMI #446

Open: sidnarayanan wants to merge 41 commits into main from lmi-fallbacks

sidnarayanan (Collaborator) commented Apr 18, 2026

Bringing some functionality up from LiteLLM into LMI:

  • Model configuration uses opinionated pydantic models (ModelSpec, LLMConfig), which are translated into litellm config dicts just in time.
  • LMI handles retries and fallbacks. Retries are more selective: we retry only on errors that are actually worth retrying, with jittered exponential backoff.
  • Because we own retries/fallbacks, we have better logging.
  • ModelSpec.responses_api lets us toggle on the Responses backend selectively, per model. This means we can swap between backends within a fallback cascade.
  • Response validation is pushed down from LDP into LMI, so it can use the new LMI retry/fallback logic.
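To make the retry/fallback behavior described above concrete, here is a minimal sketch of selective retries with jittered exponential backoff inside a fallback cascade. It is not the code in lmi/retry.py: the names `TransientError`, `FatalError`, `AllFailedError`, `backoff_delay`, and `call_with_fallbacks` are hypothetical, as are the default delays and the choice to move to the next model on a non-retryable error.

```python
import random
import time
from typing import Callable, List, Sequence, Tuple


class TransientError(Exception):
    """Hypothetical stand-in for errors worth retrying (timeouts, rate limits)."""


class FatalError(Exception):
    """Hypothetical stand-in for errors that a retry cannot fix (e.g. a bad request)."""


class AllFailedError(Exception):
    """Raised in this sketch once every model in the cascade has failed."""


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full jitter: a uniform delay in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0.0, min(cap, base * 2**attempt))


def call_with_fallbacks(
    models: Sequence[str],
    call: Callable[[str], str],
    max_retries: int = 2,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each model in order, retrying only transient errors with backoff."""
    errors: List[Tuple[str, Exception]] = []
    for model in models:
        for attempt in range(max_retries + 1):
            try:
                return call(model)
            except TransientError as exc:
                errors.append((model, exc))
                if attempt < max_retries:
                    sleep(backoff_delay(attempt))  # wait before retrying the same model
            except FatalError as exc:
                errors.append((model, exc))
                break  # assumption: not retryable, so fall back to the next model
    raise AllFailedError(errors)
```

Owning the loop is what makes better logging possible: every failed attempt is recorded with the model that produced it, so the final error can report the full history of the cascade.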

Notably, this drops all dependence on litellm.Router except for embeddings, which we can likely handle in a similar way. It also gets rid of a few things that annoyed me:

  • Having to remember whether parameters go in model_list or top-level kwargs
  • Differently shaped configuration at the SimpleAgent/LLMCallOp/LiteLLMModel/litellm levels. Now, the first three all operate on LLMConfig.
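The idea of one typed config that is only flattened into call kwargs at the last moment can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: the PR uses pydantic, while this sketch uses frozen dataclasses to stay dependency-free, and the class names `ModelSpecSketch` and `LLMConfigSketch`, their fields, and `to_call_kwargs` are hypothetical. The `with_extra_params` name echoes the method mentioned in the review summary for react.py, but its behavior here is a guess.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class ModelSpecSketch:
    """Hypothetical per-model settings; the field names are assumptions."""

    model: str
    temperature: Optional[float] = None
    max_tokens: Optional[int] = None
    extra_params: Dict[str, Any] = field(default_factory=dict)

    def to_call_kwargs(self) -> Dict[str, Any]:
        """Build the flat kwargs dict just in time, omitting unset fields."""
        kwargs: Dict[str, Any] = {"model": self.model, **self.extra_params}
        if self.temperature is not None:
            kwargs["temperature"] = self.temperature
        if self.max_tokens is not None:
            kwargs["max_tokens"] = self.max_tokens
        return kwargs


@dataclass(frozen=True)
class LLMConfigSketch:
    """Hypothetical ordered cascade: the first model is primary, the rest are fallbacks."""

    models: Tuple[ModelSpecSketch, ...]

    def with_extra_params(self, **params: Any) -> "LLMConfigSketch":
        """Return a copy with params merged into every model; self is not mutated."""
        return LLMConfigSketch(
            models=tuple(
                replace(m, extra_params={**m.extra_params, **params})
                for m in self.models
            )
        )


cfg = LLMConfigSketch(models=(ModelSpecSketch(model="gpt-4o-mini", max_tokens=4096),))
stopped = cfg.with_extra_params(stop=["Observation:"])
print(stopped.models[0].to_call_kwargs())
```

Because every layer would pass the same object around, there is a single place where the translation to dict form happens, and no layer needs to know where a given parameter belongs in the dict shape.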

Some proof that this is backwards compatible: no cassette churn aside from new/dropped tests.

Copilot AI (Contributor) left a comment

Pull request overview

This PR moves LLM configuration and execution concerns “up” into LMI by introducing typed, opinionated config models (ModelSpec, LLMConfig), shifting retry/fallback behavior into LMI (with clearer error semantics), and updating LDP agents/modules/tests to consume the new config shape.

Changes:

  • Replace dict-shaped llm_model configuration with typed llm_config (LLMConfig / ModelSpec) across agents, graph modules, and tests.
  • Introduce LMI-owned retry/fallback policy and new error surfaces (e.g., AllModelsExhaustedError, ModelRefusalError) with targeted unit tests.
  • Update VCR cassettes to reflect new default request parameters (notably max_tokens=4096) and updated client/library versions.
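Moving from dict-shaped configuration to a typed one implies some coercion step for existing configs. The sketch below normalizes a bare model name, a flat kwargs dict, or a litellm Router-style `model_list` dict into an ordered list of per-model kwargs. The function name `coerce_models` is hypothetical, and both the set of accepted shapes and the rule that top-level keys are shared across all models are assumptions, not the behavior of `LLMConfig.coerce`.

```python
from typing import Any, Dict, List, Union


def coerce_models(config: Union[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Normalize several input shapes into an ordered list of per-model kwargs."""
    if isinstance(config, str):
        # A bare model name becomes a single-model cascade.
        return [{"model": config}]
    if "model_list" in config:
        # Router-style shape: each entry keeps its own params under "litellm_params".
        # Assumption: top-level keys apply to every model, and per-model values win.
        shared = {k: v for k, v in config.items() if k != "model_list"}
        return [{**shared, **entry["litellm_params"]} for entry in config["model_list"]]
    if "model" in config:
        # Already a flat kwargs dict for one model; copy it so the input is untouched.
        return [dict(config)]
    raise ValueError("unrecognized model configuration shape")


legacy = {
    "model_list": [
        {"model_name": "primary", "litellm_params": {"model": "gpt-4o-mini", "temperature": 0.1}},
        {"model_name": "backup", "litellm_params": {"model": "claude-haiku-4-5-20251001"}},
    ],
    "timeout": 60,
}
for kwargs in coerce_models(legacy):
    print(kwargs)
```

The order of the resulting list doubles as the fallback order, which is one way a single structure could describe both the models and the cascade.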

Reviewed changes

Copilot reviewed 52 out of 54 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `tests/test_rollouts.py` | Updates fallback behavior assertions to expect AllModelsExhaustedError and uses llm_config legacy-coerce shape. |
| `tests/test_optimizer.py` | Migrates optimizer tests to LLMConfig.coerce() and llm_config= construction. |
| `tests/test_ops.py` | Switches op tests to LLMConfig inputs and updates gradient expectations when config becomes a scalar leaf. |
| `tests/test_modules.py` | Updates module configs/instantiation to use llm_config defaults from ReActAgent. |
| `tests/test_envs.py` | Updates agent construction to llm_config and error expectations to AllModelsExhaustedError. |
| `tests/test_agents.py` | Migrates agents to llm_config, updates serialization assertions and gradient assertions. |
| `tests/cassettes/TestSimpleAgent.test_dummyenv[gpt-4o-mini-2024-07-18].yaml` | VCR update for new request shape (adds max_tokens) and updated client headers. |
| `tests/cassettes/TestSimpleAgent.test_dummyenv[claude-haiku-4-5-20251001].yaml` | VCR update for new request shape (max_tokens, tool schema changes). |
| `tests/cassettes/TestSimpleAgent.test_agent_grad[gpt-4o-mini-2024-07-18].yaml` | VCR update for new request shape (adds max_tokens) and updated client headers. |
| `tests/cassettes/TestSimpleAgent.test_agent_grad[claude-haiku-4-5-20251001].yaml` | VCR update for new request shape (max_tokens, tool schema changes). |
| `tests/cassettes/TestReActAgent.test_react_dummyenv[True-gpt-4-turbo].yaml` | VCR update for new request shape (adds max_tokens). |
| `tests/cassettes/TestReActAgent.test_react_dummyenv[True-claude-haiku-4-5-20251001].yaml` | VCR update for new request shape (max_tokens) and payload adjustments. |
| `tests/cassettes/TestReActAgent.test_react_dummyenv[False-claude-haiku-4-5-20251001].yaml` | VCR update for new request shape (max_tokens) and payload adjustments. |
| `tests/cassettes/TestReActAgent.test_agent_grad[True-gpt-4-turbo].yaml` | VCR update for new request shape (adds max_tokens). |
| `tests/cassettes/TestReActAgent.test_agent_grad[True-claude-haiku-4-5-20251001].yaml` | VCR update for new request shape (max_tokens) and payload adjustments. |
| `tests/cassettes/TestReActAgent.test_agent_grad[False-claude-haiku-4-5-20251001].yaml` | VCR update for new request shape (max_tokens) and payload adjustments. |
| `tests/cassettes/TestNoToolsSimpleAgent.test_dummyenv[claude-haiku-4-5-20251001].yaml` | VCR update for new request shape (max_tokens). |
| `tests/cassettes/TestMemoryAgent.test_agent_grad.yaml` | VCR update for new request shape (adds max_tokens). |
| `tests/cassettes/TestAgentState.test_no_state_mutation[agent1].yaml` | VCR update for new request shape (adds max_tokens). |
| `tests/cassettes/TestAgentState.test_no_state_mutation[agent0].yaml` | VCR update for new request shape (adds max_tokens). |
| `src/ldp/graph/modules/thought.py` | Switches module to accept LLMConfig instead of dict config. |
| `src/ldp/graph/modules/reflect.py` | Replaces dict llm_model field with LLMConfigField and updates wiring. |
| `src/ldp/graph/modules/react.py` | Uses LLMConfig.with_extra_params() to set stop sequences without mutating dicts. |
| `src/ldp/graph/modules/llm_call.py` | Updates parsed-call module to use LLMConfig and typed ConfigOp. |
| `src/ldp/graph/common_ops.py` | Changes LLMCallOp.forward() signature to accept LLMConfig and constructs LiteLLMModel via llm_config. |
| `src/ldp/agent/tree_of_thoughts_agent.py` | Migrates agent config field to LLMConfigField and updates call sites. |
| `src/ldp/agent/simple_agent.py` | Migrates agent config field to LLMConfigField and updates internal ConfigOp. |
| `src/ldp/agent/react_agent.py` | Migrates agent config field to LLMConfigField and updates module construction. |
| `packages/lmi/tests/test_retry.py` | Adds unit tests for retry/fallback classification and backoff bounds. |
| `packages/lmi/tests/test_llms.py` | Updates tests for new fallback error semantics, dispatch behavior, and Responses integration toggled per-model. |
| `packages/lmi/tests/test_litellm_patches.py` | Removes tests for the removed provider-400 retry patch. |
| `packages/lmi/tests/test_dispatch.py` | Adds end-to-end tests for LMI dispatch + retry/fallback loop (mocking litellm.acompletion). |
| `packages/lmi/tests/test_cost_tracking.py` | Simplifies tests by removing Router bypass paths and aligning to new config behavior. |
| `packages/lmi/tests/test_config.py` | Adds comprehensive tests for ModelSpec, legacy config translation, and LLMConfig.coerce / LLMConfigField. |
| `packages/lmi/tests/cassettes/TestResponsesAPIIntegration.test_basic_call.yaml` | Updates Responses API VCR cassette. |
| `packages/lmi/tests/cassettes/TestResponsesAPIIntegration.test_multi_turn_stateful.yaml` | Updates Responses API multi-turn VCR cassette. |
| `packages/lmi/tests/cassettes/TestResponsesAPIIntegration.test_responses_api_off_ignores_response_id.yaml` | Updates VCR cassette for non-Responses model behavior. |
| `packages/lmi/tests/cassettes/TestLiteLLMModel.test_max_token_truncation.yaml` | Adds/updates VCR cassette for truncation behavior with max_tokens. |
| `packages/lmi/tests/cassettes/TestLiteLLMModel.test_cost_call_single.yaml` | Adds/updates VCR cassette for streaming cost tracking request shape. |
| `packages/lmi/src/lmi/retry.py` | Introduces centralized retry/fallback policy and jittered exponential backoff. |
| `packages/lmi/src/lmi/litellm_patches.py` | Removes the provider-400 retry patch and renumbers patch docs accordingly. |
| `packages/lmi/src/lmi/exceptions.py` | Adds ModelRefusalError and AllModelsExhaustedError. |
| `packages/lmi/src/lmi/constants.py` | Removes env-flag toggle for Responses API and keeps core constants. |
| `packages/lmi/src/lmi/config.py` | Adds typed ModelSpec/LLMConfig and legacy/dict coercion utilities. |


agolajko requested a review from jamesbraza June 3, 2026 22:14

jamesbraza (Member) left a comment

Huge refactor, wow. Nice work! Looking forward to using LLMConfig
