A lightweight control layer that detects ambiguous or underspecified requests before invoking LLM, RAG and agent workflows.
The Observability Controller evaluates whether a request contains sufficient information to proceed with reasoning, retrieval, planning or execution. When critical information is missing, the controller can request clarification before downstream systems consume compute, retrieve documents, call tools or generate responses.
Many AI systems begin reasoning before they have enough information.
Example:
My deployment failed.
Despite having almost no useful context, a model may immediately begin troubleshooting, retrieving documents, generating plans or calling tools.
This can lead to:
- Generic or inaccurate responses
- Incorrect task decomposition
- Unnecessary retrieval operations
- Wasted agent execution
- Agent drift
- Repeated reasoning cycles
- Increased model consumption
The Observability Controller attempts to identify missing information before reasoning begins.
User Query
↓
Observability Controller
↓
Clarify or Proceed
↓
LLM / Agent / RAG System
If information is missing:
{
  "decision": "clarify",
  "state": "underspecified"
}
If sufficient information exists:
{
  "decision": "proceed",
  "state": "ready"
}
Input:
The system is broken.
Output:
{
  "decision": "clarify",
  "state": "underspecified"
}
Example clarification:
Which system is affected and what behaviour are you observing?
Input:
The Kubernetes deployment entered CrashLoopBackOff after upgrading from v1.31 to v1.32.
Output:
{
  "decision": "proceed",
  "state": "ready"
}
Without the controller:
User
↓
LLM / Agent
↓
Reasoning begins immediately
With the controller:
User
↓
Observability Controller
↓
Clarify or Proceed
↓
LLM / Agent
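The "Clarify or Proceed" step can be sketched as a small routing function. This is a minimal illustration, not the controller's own client code: it assumes the response arrives as the JSON text shown earlier, and the names `parse_decision`, `handle`, `run_workflow` and `ask_user` are hypothetical.

```python
import json


def parse_decision(raw: str) -> bool:
    """Return True for a "proceed" decision and False for "clarify"."""
    result = json.loads(raw)
    decision = result.get("decision")
    if decision == "proceed":
        return True
    if decision == "clarify":
        return False
    # Assumption: an unknown decision is an error, not a silent "proceed".
    raise ValueError(f"unexpected decision: {decision!r}")


def handle(raw: str, run_workflow, ask_user):
    """Call the downstream workflow only when the controller says proceed."""
    if parse_decision(raw):
        return run_workflow()
    return ask_user()
```

Failing on an unrecognised decision is a design choice in this sketch: it keeps an unexpected response from reaching the LLM or agent by accident.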
Without clarification:
User
↓
Planner Agent
↓
Research Agent
↓
Execution Agent
With clarification:
User
↓
Observability Controller
↓
Clarification
↓
Planner Agent
↓
Research Agent
↓
Execution Agent
The controller is designed to reduce ambiguity before downstream workflows begin planning, retrieval, execution or reasoning.
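As a rough sketch of the "with clarification" flow, the function below runs a chain of agent stages only after a sufficiency check returns "proceed". Everything here is an assumption for illustration: `check` stands in for a call to the controller, `ask_user` for whatever collects the clarified request, and `max_rounds` is a hypothetical limit on clarification attempts.

```python
def run_gated(query, check, stages, ask_user, max_rounds=2):
    """Run stages in order, but only once check(query) says "proceed".

    check:    callable returning a dict such as {"decision": "clarify", ...}
    stages:   callables applied in sequence (planner, research, execution, ...)
    ask_user: callable that takes the current query and returns a clarified one
    Returns the final stage output, or None if the query stays underspecified.
    """
    rounds = 0
    while check(query)["decision"] != "proceed":
        if rounds == max_rounds:
            return None  # still underspecified: no stage has run
        query = ask_user(query)
        rounds += 1

    result = query
    for stage in stages:
        result = stage(result)
    return result
```

Because the check sits in front of the first stage, an underspecified request never reaches the planner, so no downstream agent spends work on it.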
POST /observe
{
  "message": "My deployment failed.",
  "mode": "sufficiency_v2"
}
Content-Type: application/json
x-api-key: YOUR_API_KEY
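The request above could be built in Python with only the standard library, roughly as follows. This is a sketch under assumptions: the API URL is issued with beta access, so `api_url` is a placeholder, and joining it with `/observe` is a guess about how the URL is structured. The helper names are hypothetical.

```python
import json
import urllib.request


def build_observe_request(api_url, api_key, message, mode="sufficiency_v2"):
    """Build (but do not send) a POST /observe request."""
    body = json.dumps({"message": message, "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        url=api_url.rstrip("/") + "/observe",  # assumption: path is appended to the API URL
        data=body,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def observe(request, timeout=10.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting construction from sending makes it possible to inspect the request in tests without network access or a real key.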
Current evaluation includes:
- 24 multi-step workflow executions
- 45 held-out sufficiency classification tests
- 16 adversarial boundary cases
The controller was evaluated across 24 multi-step workflow executions involving planner, researcher, analyst and writer stages.
Results:
Workflow executions: 24
Completed successfully: 22
Stopped as underspecified: 2
Repairs triggered: 7
In 7 cases, ambiguity or degraded intermediate outputs were identified and corrected during evaluation.
The controller prevented execution of workflows that remained underspecified and allowed sufficiently specified workflows to proceed through the execution chain.
Held-out benchmark:
45 / 45 classifications correct
Adversarial boundary benchmark:
12 / 16 classifications correct
75% agreement
The adversarial benchmark contains intentionally ambiguous requests designed to test the controller near decision boundaries.
These results are exploratory beta findings intended to evaluate clarification-first workflow control rather than establish final performance claims.
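For readers who want to check the figures, the arithmetic behind the reported rates is shown below. It uses only the counts stated above; the helper name is illustrative.

```python
def fraction_correct(correct: int, total: int) -> float:
    """Share of classifications that matched the expected label."""
    return correct / total


held_out = fraction_correct(45, 45)     # held-out benchmark
adversarial = fraction_correct(12, 16)  # adversarial boundary benchmark
accounted_for = 22 + 2                  # completed + stopped as underspecified

print(f"held-out: {held_out:.0%}")        # held-out: 100%
print(f"adversarial: {adversarial:.0%}")  # adversarial: 75%
print(accounted_for == 24)                # True
```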
In addition to request-level clarification, the controller has been evaluated experimentally within multi-stage workflows involving planner, researcher, analyst and writer stages.
Early evaluation explored:
- Intermediate handoff validation
- Ambiguity detection between workflow stages
- Repair and revalidation of degraded intermediate outputs
- Workflow stopping when sufficient context could not be recovered
These capabilities remain experimental and are not currently exposed through the public API.
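The validate, repair, revalidate and stop behaviour listed above might look something like the loop below. This is purely an illustrative sketch: these capabilities are not exposed through the public API, and `is_sufficient`, `repair`, `max_repairs` and the returned dictionary shape are all hypothetical.

```python
def run_with_handoff_checks(task, stages, is_sufficient, repair, max_repairs=1):
    """Validate each stage's output before handing it to the next stage.

    stages:        list of (name, callable) pairs, run in order
    is_sufficient: callable(name, output) -> bool, the handoff check
    repair:        callable(name, output) -> repaired output
    """
    payload = task
    repairs = 0
    for name, stage in stages:
        output = stage(payload)
        attempts = 0
        # Revalidate after every repair; stop if context cannot be recovered.
        while not is_sufficient(name, output):
            if attempts == max_repairs:
                return {"status": "stopped_underspecified",
                        "stopped_at": name, "repairs": repairs}
            output = repair(name, output)
            attempts += 1
            repairs += 1
        payload = output
    return {"status": "completed", "output": payload, "repairs": repairs}
```

Capping repairs per stage keeps a degraded handoff from turning into an open-ended retry loop.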
The controller is model-agnostic and can be integrated ahead of:
- OpenAI workflows
- Claude workflows
- LangGraph pipelines
- CrewAI systems
- AutoGen agents
- Internal support assistants
- Operational triage systems
- Customer support workflows
- Custom RAG systems
- Internal AI copilots
Typical use cases include:
- Incident triage
- Support ticket routing
- AI agent workflows
- Retrieval-augmented generation (RAG)
- Internal engineering assistants
- Operational diagnostics
The controller currently supports two clarification workflows.
The first returns a predefined clarification question directly.
Additional model tokens: 0
The second returns a model-agnostic clarification prompt that can be sent to OpenAI, Claude, Gemini, Ollama, Mistral or internal models to generate a context-specific clarification question.
Example:
Input:
"The patient became unwell."
Generated clarification:
"What symptoms is the patient experiencing?"
Typical clarification generation cost:
~40–60 tokens
This is typically much smaller than invoking a full reasoning, retrieval or diagnostic workflow.
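One way a caller might choose between the two clarification workflows is sketched below. How the controller actually delivers the predefined question or the prompt is not documented here, so the sketch assumes both have already been extracted as plain strings; `generate` is a hypothetical wrapper around whichever model is used.

```python
def clarification_question(predefined=None, prompt=None, generate=None):
    """Return the question to show the user.

    predefined: ready-made question (no additional model tokens)
    prompt:     model-agnostic clarification prompt
    generate:   callable(prompt) -> str wrapping any model provider
    """
    if predefined is not None:
        return predefined
    if prompt is not None and generate is not None:
        # A short generation step instead of a full reasoning workflow.
        return generate(prompt).strip()
    raise ValueError("need a predefined question, or a prompt plus a generate callable")
```

Passing the model call in as a plain callable keeps the sketch independent of any single provider's SDK.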
This is currently a private beta API.
To request access or provide feedback:
Please include a short description of your use case.
You will receive:
- API URL
- API key
Never commit API keys into public repositories.
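A common way to keep the key out of source control is to read it from the environment at runtime. The sketch below assumes two environment variable names, `OBSERVABILITY_API_URL` and `OBSERVABILITY_API_KEY`, which are illustrative and not defined by the API.

```python
import os


def load_api_credentials(env=os.environ):
    """Read the API URL and key from environment variables (names are assumptions)."""
    try:
        return env["OBSERVABILITY_API_URL"], env["OBSERVABILITY_API_KEY"]
    except KeyError as missing:
        # Fail with a clear message instead of falling back to a hard-coded key.
        raise RuntimeError(f"set {missing.args[0]} in the environment") from None
```

If the values are kept in a local file such as `.env`, that file should be listed in `.gitignore`.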
examples/curl_example.sh
examples/python_example.py
examples/openai_gated_example.py
examples/model_agnostic_workflow.py (supports Claude, Gemini, Ollama, Mistral, internal models and other providers)
Private beta.
Current evaluation focuses on:
- Operational diagnostics
- Support triage
- RAG systems
- Agent workflows
- Internal AI assistants
The objective is to determine whether asking for clarification before reasoning begins improves downstream workflow quality, execution efficiency and resource utilisation.
Areas currently under evaluation include:
- Clarification accuracy
- Agent workflow impact
- Retrieval quality improvements
- Tool call reduction
- Human preference testing
- Workflow repair and revalidation