Implement Structured Output with Pydantic Model #12
Pull Request Overview
This PR implements structured output support for the ChatGradient model using Pydantic models, enabling the model to return validated, type-safe responses conforming to predefined schemas.
Key Changes:
- Added with_structured_output() method to ChatGradient that accepts Pydantic models, TypedDict, or JSON schemas
- Implemented JSON extraction and parsing utilities with error handling
- Added comprehensive unit tests covering basic Pydantic output, validation errors, and raw mode
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| langchain_gradient/chat_models.py | Implemented with_structured_output() method and helper functions for parsing JSON from model responses |
| tests/unit_tests/test_structured_output.py | Added unit tests for structured output functionality with Pydantic models |
| docs/structured_output_examples.md | Added documentation and usage examples for structured output feature |
content = message.content if hasattr(message, 'content') else str(message)

# Try to extract JSON from code blocks (``````)
pattern = r"``````"
The regex pattern r"``````" will only match exactly 6 backticks in a row. Based on the test mock responses, which use triple backticks (```), this pattern should be r"```(.+?)```" to correctly extract JSON from markdown code blocks.
Suggested change:
- pattern = r"``````"
+ pattern = r"```(.+?)```"
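One caveat about the suggested pattern: `.` does not match newlines by default, so it would only capture JSON that sits on a single line between the fences. Below is a minimal sketch of an extraction helper, assuming the model wraps multi-line JSON in triple backticks with an optional `json` tag. The name `extract_json_block` is hypothetical and is not taken from the PR.

```python
import json
import re

# Optional "json" tag after the opening fence, then a lazy capture up to the
# closing fence. re.DOTALL lets "." span newlines so multi-line JSON matches.
FENCE_RE = re.compile(r"```(?:json)?\s*(.+?)\s*```", re.DOTALL)


def extract_json_block(text: str):
    """Parse JSON from the first fenced block, or from the whole text if unfenced."""
    match = FENCE_RE.search(text)
    candidate = match.group(1) if match else text.strip()
    # json.loads raises json.JSONDecodeError when the candidate is not valid JSON.
    return json.loads(candidate)
```

Falling back to the whole text when no fence is found keeps the helper usable when the model ignores the wrapping instruction.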
prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a helpful assistant that outputs valid JSON matching this schema:\n"
     "``````\n"
The prompt template contains 6 backticks but the regex pattern expects 6 backticks exactly. This creates an inconsistency. The template should use triple backticks (```) to match standard markdown code block syntax, or the regex pattern in _extract_and_parse_json needs to be updated accordingly.
    ("system",
     "You are a helpful assistant that outputs valid JSON matching the schema below.\n"
     "{format_instructions}\n"
     "Always wrap your JSON output in `````` tags."),
The instruction says 'wrap your JSON output in `````` tags' but standard markdown code blocks use triple backticks (```). This mismatch will cause parsing to fail. Change to triple backticks or update the regex pattern to match 6 backticks.
Suggested change:
- "Always wrap your JSON output in `````` tags."),
+ "Always wrap your JSON output in ``` tags."),
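One way to keep the prompt instruction and the parser from drifting apart is to derive both from a single delimiter constant. The sketch below assumes that approach; `FENCE`, `WRAP_INSTRUCTION`, and `fenced_payload` are hypothetical names, and the PR's actual prompt wording may differ.

```python
import re

FENCE = "```"  # single source of truth for the delimiter

# Hypothetical instruction text built from the same constant the parser uses.
WRAP_INSTRUCTION = f"Always wrap your JSON output in {FENCE} fences."

# re.escape guards against delimiters that contain regex metacharacters.
FENCE_PATTERN = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.+?)\s*" + re.escape(FENCE),
    re.DOTALL,
)


def fenced_payload(text: str):
    """Return the text inside the first fenced block, or None if there is none."""
    match = FENCE_PATTERN.search(text)
    return match.group(1) if match else None
```

With this layout, changing the delimiter in one place updates both what the model is told to emit and what the parser looks for.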
else:
    print(f"Parsed: {result['parsed']}")

undefined
The file ends with 'undefined' which appears to be an unintended placeholder or error. This line should be removed.
Suggested change:
- undefined
from pydantic import BaseModel, Field
from langchain_gradient import ChatGradient

class Person(BaseModel):
    name: str = Field(description="The person's name")
    age: int = Field(description="The person's age")
    email: str = Field(description="Email address")

llm = ChatGradient(model="llama3.3-70b-instruct")
structured_llm = llm.with_structured_output(Person)

response = structured_llm.invoke(
    "Create a person named John, age 30, email john@example.com"
)
print(response)

Output: Person(name='John', age=30, email='john@example.com')

## Multiple Objects

from typing import List

class PersonList(BaseModel):
    people: List[Person]

structured_llm = llm.with_structured_output(PersonList)
response = structured_llm.invoke("Create 3 people with different names")
print(response.people) # List of Person objects

## Error Handling

Get raw output and errors
structured_llm = llm.with_structured_output(Person, include_raw=True)

result = structured_llm.invoke("Create a person")

if result["parsing_error"]:
    print(f"Error: {result['parsing_error']}")
    print(f"Raw: {result['raw'].content}")
else:
    print(f"Parsed: {result['parsed']}")
The code block is missing proper markdown formatting. It should be wrapped in triple backticks with the language identifier (```python) to render correctly as a code block.
Suggested change:

```python
from pydantic import BaseModel, Field
from langchain_gradient import ChatGradient

class Person(BaseModel):
    name: str = Field(description="The person's name")
    age: int = Field(description="The person's age")
    email: str = Field(description="Email address")

llm = ChatGradient(model="llama3.3-70b-instruct")
structured_llm = llm.with_structured_output(Person)

response = structured_llm.invoke(
    "Create a person named John, age 30, email john@example.com"
)
print(response)
```

Output: Person(name='John', age=30, email='john@example.com')

## Multiple Objects

```python
from typing import List

class PersonList(BaseModel):
    people: List[Person]

structured_llm = llm.with_structured_output(PersonList)
response = structured_llm.invoke("Create 3 people with different names")
print(response.people)  # List of Person objects
```

## Error Handling

Get raw output and errors

```python
structured_llm = llm.with_structured_output(Person, include_raw=True)

result = structured_llm.invoke("Create a person")

if result["parsing_error"]:
    print(f"Error: {result['parsing_error']}")
    print(f"Raw: {result['raw'].content}")
else:
    print(f"Parsed: {result['parsed']}")
```
## Error Handling

Get raw output and errors
This line should be formatted as a comment or descriptive text (e.g., '# Get raw output and errors') rather than appearing as standalone text outside a code block.
Suggested change:
- Get raw output and errors
+ **Get raw output and errors**
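The Error Handling example reads three keys from the include_raw result: raw, parsed, and parsing_error. Below is a minimal sketch of how such a result could be assembled without raising on bad model output. The functions `parse_with_raw` and `require_person_fields` are hypothetical and are not the PR's implementation. Here raw is a plain string, whereas the example above accesses `result['raw'].content`, which implies a message object.

```python
import json
from typing import Any, Callable, Dict


def parse_with_raw(raw_text: str, validate: Callable[[Any], Any]) -> Dict[str, Any]:
    """Return raw/parsed/parsing_error keys instead of raising on bad output."""
    try:
        parsed = validate(json.loads(raw_text))
        return {"raw": raw_text, "parsed": parsed, "parsing_error": None}
    except Exception as exc:  # JSON decoding and validation failures alike
        return {"raw": raw_text, "parsed": None, "parsing_error": exc}


def require_person_fields(data: Any) -> Any:
    """Stand-in validator; with Pydantic v2 this could be Person.model_validate."""
    missing = {"name", "age", "email"} - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data
```

Returning the exception rather than raising it lets the caller branch on `result["parsing_error"]`, as the documented example does.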
Solved #8