# Pydantic AI > GenAI Agent Framework, the Pydantic way Pydantic AI is a Python agent framework designed to make it less painful to build production grade applications with Generative AI. # Concepts documentation ## Introduction Agents are Pydantic AI's primary interface for interacting with LLMs. In some use cases a single Agent will control an entire application or component, but multiple agents can also interact to embody more complex workflows. The Agent class has full API documentation, but conceptually you can think of an agent as a container for: | **Component** | **Description** | | --- | --- | | [Instructions](#instructions) | A set of instructions for the LLM written by the developer. | | [Function tool(s)](../tools/) and [toolsets](../toolsets/) | Functions that the LLM may call to get information while generating a response. | | [Structured output type](../output/) | The structured datatype the LLM must return at the end of a run, if specified. | | [Dependency type constraint](../dependencies/) | Dynamic instructions functions, tools, and output functions may all use dependencies when they're run. | | [LLM model](../api/models/base/) | Optional default LLM model associated with the agent. Can also be specified when running the agent. | | [Model Settings](#additional-configuration) | Optional default model settings to help fine tune requests. Can also be specified when running the agent. | In typing terms, agents are generic in their dependency and output types, e.g., an agent which required dependencies of type `Foobar` and produced outputs of type `list[str]` would have type `Agent[Foobar, list[str]]`. In practice, you shouldn't need to care about this, it should just mean your IDE can tell you when you have the right type, and if you choose to use [static type checking](#static-type-checking) it should work well with Pydantic AI. Here's a toy example of an agent that simulates a roulette wheel: roulette_wheel.py ```python from pydantic_ai import Agent, RunContext roulette_agent = Agent( # (1)! 'openai:gpt-4o', deps_type=int, output_type=bool, system_prompt=( 'Use the `roulette_wheel` function to see if the ' 'customer has won based on the number they provide.' ), ) @roulette_agent.tool async def roulette_wheel(ctx: RunContext[int], square: int) -> str: # (2)! """check if the square is a winner""" return 'winner' if square == ctx.deps else 'loser' # Run the agent success_number = 18 # (3)! result = roulette_agent.run_sync('Put my money on square eighteen', deps=success_number) print(result.output) # (4)! #> True result = roulette_agent.run_sync('I bet five is the winner', deps=success_number) print(result.output) #> False ``` 1. Create an agent, which expects an integer dependency and produces a boolean output. This agent will have type `Agent[int, bool]`. 1. Define a tool that checks if the square is a winner. Here RunContext is parameterized with the dependency type `int`; if you got the dependency type wrong you'd get a typing error. 1. In reality, you might want to use a random number here e.g. `random.randint(0, 36)`. 1. `result.output` will be a boolean indicating if the square is a winner. Pydantic performs the output validation, and it'll be typed as a `bool` since its type is derived from the `output_type` generic parameter of the agent. Agents are designed for reuse, like FastAPI Apps Agents are intended to be instantiated once (frequently as module globals) and reused throughout your application, similar to a small FastAPI app or an APIRouter. 
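For example, here is a minimal sketch of this pattern (the agent, its instructions, and the helper functions are purely illustrative, not part of the library):

reusable_agent.py

```python
from pydantic_ai import Agent

# Instantiated once at module level and reused for every request,
# much like a FastAPI app or an APIRouter.
support_agent = Agent(
    'openai:gpt-4o',
    instructions='Answer customer questions concisely.',
)


def answer_question(question: str) -> str:
    # Reuse the same agent instance across calls.
    result = support_agent.run_sync(question)
    return result.output


async def answer_question_async(question: str) -> str:
    result = await support_agent.run(question)
    return result.output
```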
## Running Agents There are four ways to run an agent: 1. agent.run() — an async function which returns a RunResult containing a completed response. 1. agent.run_sync() — a plain, synchronous function which returns a RunResult containing a completed response (internally, this just calls `loop.run_until_complete(self.run())`). 1. agent.run_stream() — an async context manager which returns a StreamedRunResult, which contains methods to stream text and structured output as an async iterable. 1. agent.iter() — a context manager which returns an AgentRun, an async-iterable over the nodes of the agent's underlying Graph. Here's a simple example demonstrating the first three: run_agent.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. async def main(): result = await agent.run('What is the capital of France?') print(result.output) #> The capital of France is Paris. async with agent.run_stream('What is the capital of the UK?') as response: async for text in response.stream_text(): print(text) #> The capital of #> The capital of the UK is #> The capital of the UK is London. ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* You can also pass messages from previous runs to continue a conversation or provide context, as described in [Messages and Chat History](../message-history/). ### Streaming Events and Final Output As shown in the example above, run_stream() makes it easy to stream the agent's final output as it comes in. It also takes an optional `event_stream_handler` argument that you can use to gain insight into what is happening during the run before the final output is produced. The example below shows how to stream events and text output. You can also [stream structured output](../output/#streaming-structured-output). Note As the `run_stream()` method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream all events from the model's streaming response and the agent's execution of tools, use agent.run() with an `event_stream_handler` or agent.iter() instead, as described in the following sections. run_stream_events.py ```python import asyncio from collections.abc import AsyncIterable from datetime import date from pydantic_ai import Agent, RunContext from pydantic_ai.messages import ( AgentStreamEvent, FinalResultEvent, FunctionToolCallEvent, FunctionToolResultEvent, PartDeltaEvent, PartStartEvent, TextPartDelta, ThinkingPartDelta, ToolCallPartDelta, ) weather_agent = Agent( 'openai:gpt-4o', system_prompt='Providing a weather forecast at the locations the user provides.', ) @weather_agent.tool async def weather_forecast( ctx: RunContext, location: str, forecast_date: date, ) -> str: return f'The forecast in {location} on {forecast_date} is 24°C and sunny.' 
output_messages: list[str] = [] async def event_stream_handler( ctx: RunContext, event_stream: AsyncIterable[AgentStreamEvent], ): async for event in event_stream: if isinstance(event, PartStartEvent): output_messages.append(f'[Request] Starting part {event.index}: {event.part!r}') elif isinstance(event, PartDeltaEvent): if isinstance(event.delta, TextPartDelta): output_messages.append(f'[Request] Part {event.index} text delta: {event.delta.content_delta!r}') elif isinstance(event.delta, ThinkingPartDelta): output_messages.append(f'[Request] Part {event.index} thinking delta: {event.delta.content_delta!r}') elif isinstance(event.delta, ToolCallPartDelta): output_messages.append(f'[Request] Part {event.index} args delta: {event.delta.args_delta}') elif isinstance(event, FunctionToolCallEvent): output_messages.append( f'[Tools] The LLM calls tool={event.part.tool_name!r} with args={event.part.args} (tool_call_id={event.part.tool_call_id!r})' ) elif isinstance(event, FunctionToolResultEvent): output_messages.append(f'[Tools] Tool call {event.tool_call_id!r} returned => {event.result.content}') elif isinstance(event, FinalResultEvent): output_messages.append(f'[Result] The model starting producing a final result (tool_name={event.tool_name})') async def main(): user_prompt = 'What will the weather be like in Paris on Tuesday?' async with weather_agent.run_stream(user_prompt, event_stream_handler=event_stream_handler) as run: async for output in run.stream_text(): output_messages.append(f'[Output] {output}') if __name__ == '__main__': asyncio.run(main()) print(output_messages) """ [ "[Request] Starting part 0: ToolCallPart(tool_name='weather_forecast', tool_call_id='0001')", '[Request] Part 0 args delta: {"location":"Pa', '[Request] Part 0 args delta: ris","forecast_', '[Request] Part 0 args delta: date":"2030-01-', '[Request] Part 0 args delta: 01"}', '[Tools] The LLM calls tool=\'weather_forecast\' with args={"location":"Paris","forecast_date":"2030-01-01"} (tool_call_id=\'0001\')', "[Tools] Tool call '0001' returned => The forecast in Paris on 2030-01-01 is 24°C and sunny.", "[Request] Starting part 0: TextPart(content='It will be ')", '[Result] The model starting producing a final result (tool_name=None)', '[Output] It will be ', '[Output] It will be warm and sunny ', '[Output] It will be warm and sunny in Paris on ', '[Output] It will be warm and sunny in Paris on Tuesday.', ] """ ``` ### Streaming All Events Like `agent.run_stream()`, agent.run() takes an optional `event_stream_handler` argument that lets you stream all events from the model's streaming response and the agent's execution of tools. Unlike `run_stream()`, it always runs the agent graph to completion even if text was received ahead of tool calls that looked like it could've been the final result. Note When used with an `event_stream_handler`, the `run()` method currently requires you to piece together the streamed text yourself from the `PartStartEvent` and subsequent `PartDeltaEvent`s instead of providing a `stream_text()` convenience method. To get the best of both worlds, at the expense of some additional complexity, you can use agent.iter() as described in the next section, which lets you [iterate over the agent graph](#iterating-over-an-agents-graph) and [stream both events and output](#streaming-all-events-and-output) at every step. 
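If you do want the assembled text when using `run()` with an event handler, a minimal sketch like the following can piece it together from the events (the handler and variable names are illustrative, and it assumes the final answer is the only text in the response); the full `run_events.py` example for this section follows below.

collect_streamed_text.py

```python
import asyncio
from collections.abc import AsyncIterable

from pydantic_ai import Agent, RunContext
from pydantic_ai.messages import (
    AgentStreamEvent,
    PartDeltaEvent,
    PartStartEvent,
    TextPart,
    TextPartDelta,
)

agent = Agent('openai:gpt-4o')

text_chunks: list[str] = []


async def collect_text(
    ctx: RunContext,
    event_stream: AsyncIterable[AgentStreamEvent],
):
    # Rebuild the streamed text from part-start and part-delta events.
    async for event in event_stream:
        if isinstance(event, PartStartEvent) and isinstance(event.part, TextPart):
            text_chunks.append(event.part.content)
        elif isinstance(event, PartDeltaEvent) and isinstance(event.delta, TextPartDelta):
            text_chunks.append(event.delta.content_delta)


async def main():
    await agent.run('What is the capital of France?', event_stream_handler=collect_text)
    print(''.join(text_chunks))


if __name__ == '__main__':
    asyncio.run(main())
```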
run_events.py ```python import asyncio from run_stream_events import event_stream_handler, output_messages, weather_agent async def main(): user_prompt = 'What will the weather be like in Paris on Tuesday?' run = await weather_agent.run(user_prompt, event_stream_handler=event_stream_handler) output_messages.append(f'[Final Output] {run.output}') if __name__ == '__main__': asyncio.run(main()) print(output_messages) """ [ "[Request] Starting part 0: ToolCallPart(tool_name='weather_forecast', tool_call_id='0001')", '[Request] Part 0 args delta: {"location":"Pa', '[Request] Part 0 args delta: ris","forecast_', '[Request] Part 0 args delta: date":"2030-01-', '[Request] Part 0 args delta: 01"}', '[Tools] The LLM calls tool=\'weather_forecast\' with args={"location":"Paris","forecast_date":"2030-01-01"} (tool_call_id=\'0001\')', "[Tools] Tool call '0001' returned => The forecast in Paris on 2030-01-01 is 24°C and sunny.", "[Request] Starting part 0: TextPart(content='It will be ')", '[Result] The model starting producing a final result (tool_name=None)', "[Request] Part 0 text delta: 'warm and sunny '", "[Request] Part 0 text delta: 'in Paris on '", "[Request] Part 0 text delta: 'Tuesday.'", '[Final Output] It will be warm and sunny in Paris on Tuesday.', ] """ ``` *(This example is complete, it can be run "as is")* ### Iterating Over an Agent's Graph Under the hood, each `Agent` in Pydantic AI uses **pydantic-graph** to manage its execution flow. **pydantic-graph** is a generic, type-centric library for building and running finite state machines in Python. It doesn't actually depend on Pydantic AI — you can use it standalone for workflows that have nothing to do with GenAI — but Pydantic AI makes use of it to orchestrate the handling of model requests and model responses in an agent's run. In many scenarios, you don't need to worry about pydantic-graph at all; calling `agent.run(...)` simply traverses the underlying graph from start to finish. However, if you need deeper insight or control — for example to inject your own logic at specific stages — Pydantic AI exposes the lower-level iteration process via Agent.iter. This method returns an AgentRun, which you can async-iterate over, or manually drive node-by-node via the next method. Once the agent's graph returns an End, you have the final result along with a detailed history of all steps. 
#### `async for` iteration Here's an example of using `async for` with `iter` to record each node the agent executes: agent_iter_async_for.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] # Begin an AgentRun, which is an async-iterable over the nodes of the agent's graph async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: # Each node represents a step in the agent's execution nodes.append(node) print(nodes) """ [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] """ print(agent_run.result.output) #> The capital of France is Paris. ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* - The `AgentRun` is an async iterator that yields each node (`BaseNode` or `End`) in the flow. - The run ends when an `End` node is returned. #### Using `.next(...)` manually You can also drive the iteration manually by passing the node you want to run next to the `AgentRun.next(...)` method. This allows you to inspect or modify the node before it executes or skip nodes based on your own logic, and to catch errors in `next()` more easily: agent_iter_next.py ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-4o') async def main(): async with agent.iter('What is the capital of France?') as agent_run: node = agent_run.next_node # (1)! all_nodes = [node] # Drive the iteration manually: while not isinstance(node, End): # (2)! node = await agent_run.next(node) # (3)! all_nodes.append(node) # (4)! print(all_nodes) """ [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] """ ``` 1. We start by grabbing the first node that will be run in the agent's graph. 1. The agent run is finished once an `End` node has been produced; instances of `End` cannot be passed to `next`. 1. When you call `await agent_run.next(node)`, it executes that node in the agent's graph, updates the run's history, and returns the *next* node to run. 1. You could also inspect or mutate the new `node` here as needed. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* #### Accessing usage and final output You can retrieve usage statistics (tokens, requests, etc.) at any time from the AgentRun object via `agent_run.usage()`. 
This method returns a RunUsage object containing the usage data. Once the run finishes, `agent_run.result` becomes a AgentRunResult object containing the final output (and related metadata). #### Streaming All Events and Output Here is an example of streaming an agent run in combination with `async for` iteration: streaming_iter.py ```python import asyncio from dataclasses import dataclass from datetime import date from pydantic_ai import Agent, RunContext from pydantic_ai.messages import ( FinalResultEvent, FunctionToolCallEvent, FunctionToolResultEvent, PartDeltaEvent, PartStartEvent, TextPartDelta, ThinkingPartDelta, ToolCallPartDelta, ) @dataclass class WeatherService: async def get_forecast(self, location: str, forecast_date: date) -> str: # In real code: call weather API, DB queries, etc. return f'The forecast in {location} on {forecast_date} is 24°C and sunny.' async def get_historic_weather(self, location: str, forecast_date: date) -> str: # In real code: call a historical weather API or DB return f'The weather in {location} on {forecast_date} was 18°C and partly cloudy.' weather_agent = Agent[WeatherService, str]( 'openai:gpt-4o', deps_type=WeatherService, output_type=str, # We'll produce a final answer as plain text system_prompt='Providing a weather forecast at the locations the user provides.', ) @weather_agent.tool async def weather_forecast( ctx: RunContext[WeatherService], location: str, forecast_date: date, ) -> str: if forecast_date >= date.today(): return await ctx.deps.get_forecast(location, forecast_date) else: return await ctx.deps.get_historic_weather(location, forecast_date) output_messages: list[str] = [] async def main(): user_prompt = 'What will the weather be like in Paris on Tuesday?' # Begin a node-by-node, streaming iteration async with weather_agent.iter(user_prompt, deps=WeatherService()) as run: async for node in run: if Agent.is_user_prompt_node(node): # A user prompt node => The user has provided input output_messages.append(f'=== UserPromptNode: {node.user_prompt} ===') elif Agent.is_model_request_node(node): # A model request node => We can stream tokens from the model's request output_messages.append('=== ModelRequestNode: streaming partial request tokens ===') async with node.stream(run.ctx) as request_stream: final_result_found = False async for event in request_stream: if isinstance(event, PartStartEvent): output_messages.append(f'[Request] Starting part {event.index}: {event.part!r}') elif isinstance(event, PartDeltaEvent): if isinstance(event.delta, TextPartDelta): output_messages.append( f'[Request] Part {event.index} text delta: {event.delta.content_delta!r}' ) elif isinstance(event.delta, ThinkingPartDelta): output_messages.append( f'[Request] Part {event.index} thinking delta: {event.delta.content_delta!r}' ) elif isinstance(event.delta, ToolCallPartDelta): output_messages.append( f'[Request] Part {event.index} args delta: {event.delta.args_delta}' ) elif isinstance(event, FinalResultEvent): output_messages.append( f'[Result] The model started producing a final result (tool_name={event.tool_name})' ) final_result_found = True break if final_result_found: # Once the final result is found, we can call `AgentStream.stream_text()` to stream the text. # A similar `AgentStream.stream_output()` method is available to stream structured output. 
async for output in request_stream.stream_text(): output_messages.append(f'[Output] {output}') elif Agent.is_call_tools_node(node): # A handle-response node => The model returned some data, potentially calls a tool output_messages.append('=== CallToolsNode: streaming partial response & tool usage ===') async with node.stream(run.ctx) as handle_stream: async for event in handle_stream: if isinstance(event, FunctionToolCallEvent): output_messages.append( f'[Tools] The LLM calls tool={event.part.tool_name!r} with args={event.part.args} (tool_call_id={event.part.tool_call_id!r})' ) elif isinstance(event, FunctionToolResultEvent): output_messages.append( f'[Tools] Tool call {event.tool_call_id!r} returned => {event.result.content}' ) elif Agent.is_end_node(node): # Once an End node is reached, the agent run is complete assert run.result is not None assert run.result.output == node.data.output output_messages.append(f'=== Final Agent Output: {run.result.output} ===') if __name__ == '__main__': asyncio.run(main()) print(output_messages) """ [ '=== UserPromptNode: What will the weather be like in Paris on Tuesday? ===', '=== ModelRequestNode: streaming partial request tokens ===', "[Request] Starting part 0: ToolCallPart(tool_name='weather_forecast', tool_call_id='0001')", '[Request] Part 0 args delta: {"location":"Pa', '[Request] Part 0 args delta: ris","forecast_', '[Request] Part 0 args delta: date":"2030-01-', '[Request] Part 0 args delta: 01"}', '=== CallToolsNode: streaming partial response & tool usage ===', '[Tools] The LLM calls tool=\'weather_forecast\' with args={"location":"Paris","forecast_date":"2030-01-01"} (tool_call_id=\'0001\')', "[Tools] Tool call '0001' returned => The forecast in Paris on 2030-01-01 is 24°C and sunny.", '=== ModelRequestNode: streaming partial request tokens ===', "[Request] Starting part 0: TextPart(content='It will be ')", '[Result] The model started producing a final result (tool_name=None)', '[Output] It will be ', '[Output] It will be warm and sunny ', '[Output] It will be warm and sunny in Paris on ', '[Output] It will be warm and sunny in Paris on Tuesday.', '=== CallToolsNode: streaming partial response & tool usage ===', '=== Final Agent Output: It will be warm and sunny in Paris on Tuesday. ===', ] """ ``` *(This example is complete, it can be run "as is")* ### Additional Configuration #### Usage Limits Pydantic AI offers a UsageLimits structure to help you limit your usage (tokens, requests, and tool calls) on model runs. You can apply these settings by passing the `usage_limits` argument to the `run{_sync,_stream}` functions. Consider the following example, where we limit the number of response tokens: ```py from pydantic_ai import Agent, UsageLimitExceeded, UsageLimits agent = Agent('anthropic:claude-3-5-sonnet-latest') result_sync = agent.run_sync( 'What is the capital of Italy? Answer with just the city.', usage_limits=UsageLimits(response_tokens_limit=10), ) print(result_sync.output) #> Rome print(result_sync.usage()) #> RunUsage(input_tokens=62, output_tokens=1, requests=1) try: result_sync = agent.run_sync( 'What is the capital of Italy? 
Answer with a paragraph.', usage_limits=UsageLimits(response_tokens_limit=10), ) except UsageLimitExceeded as e: print(e) #> Exceeded the output_tokens_limit of 10 (output_tokens=32) ``` Restricting the number of requests can be useful in preventing infinite loops or excessive tool calling: ```py from typing_extensions import TypedDict from pydantic_ai import Agent, ModelRetry, UsageLimitExceeded, UsageLimits class NeverOutputType(TypedDict): """ Never ever coerce data to this type. """ never_use_this: str agent = Agent( 'anthropic:claude-3-5-sonnet-latest', retries=3, output_type=NeverOutputType, system_prompt='Any time you get a response, call the `infinite_retry_tool` to produce another response.', ) @agent.tool_plain(retries=5) # (1)! def infinite_retry_tool() -> int: raise ModelRetry('Please try again.') try: result_sync = agent.run_sync( 'Begin infinite retry loop!', usage_limits=UsageLimits(request_limit=3) # (2)! ) except UsageLimitExceeded as e: print(e) #> The next request would exceed the request_limit of 3 ``` 1. This tool has the ability to retry 5 times before erroring, simulating a tool that might get stuck in a loop. 1. This run will error after 3 requests, preventing the infinite tool calling. ##### Capping tool calls If you need a limit on the number of successful tool invocations within a single run, use `tool_calls_limit`: ```py from pydantic_ai import Agent from pydantic_ai.exceptions import UsageLimitExceeded from pydantic_ai.usage import UsageLimits agent = Agent('anthropic:claude-3-5-sonnet-latest') @agent.tool_plain def do_work() -> str: return 'ok' try: # Allow at most one executed tool call in this run agent.run_sync('Please call the tool twice', usage_limits=UsageLimits(tool_calls_limit=1)) except UsageLimitExceeded as e: print(e) #> The next tool call would exceed the tool_calls_limit of 1 (tool_calls=1) ``` Note - Usage limits are especially relevant if you've registered many tools. Use `request_limit` to bound the number of model turns, and `tool_calls_limit` to cap the number of successful tool executions within a run. - These limits are enforced at the final stage before the LLM is called. If your limits are stricter than your retry settings, the usage limit will be reached before all retries are attempted. #### Model (Run) Settings Pydantic AI offers a settings.ModelSettings structure to help you fine tune your requests. This structure allows you to configure common parameters that influence the model's behavior, such as `temperature`, `max_tokens`, `timeout`, and more. There are three ways to apply these settings, with a clear precedence order: 1. **Model-level defaults** - Set when creating a model instance via the `settings` parameter. These serve as the base defaults for that model. 1. **Agent-level defaults** - Set during Agent initialization via the `model_settings` argument. These are merged with model defaults, with agent settings taking precedence. 1. **Run-time overrides** - Passed to `run{_sync,_stream}` functions via the `model_settings` argument. These have the highest priority and are merged with the combined agent and model defaults. For example, if you'd like to set the `temperature` setting to `0.0` to ensure less random behavior, you can do the following: ```py from pydantic_ai import Agent, ModelSettings from pydantic_ai.models.openai import OpenAIChatModel # 1. Model-level defaults model = OpenAIChatModel( 'gpt-4o', settings=ModelSettings(temperature=0.8, max_tokens=500) # Base defaults ) # 2. 
Agent-level defaults (overrides model defaults by merging) agent = Agent(model, model_settings=ModelSettings(temperature=0.5)) # 3. Run-time overrides (highest priority) result_sync = agent.run_sync( 'What is the capital of Italy?', model_settings=ModelSettings(temperature=0.0) # Final temperature: 0.0 ) print(result_sync.output) #> The capital of Italy is Rome. ``` The final request uses `temperature=0.0` (run-time), `max_tokens=500` (from model), demonstrating how settings merge with run-time taking precedence. Model Settings Support Model-level settings are supported by all concrete model implementations (OpenAI, Anthropic, Google, etc.). Wrapper models like `FallbackModel`, `WrapperModel`, and `InstrumentedModel` don't have their own settings - they use the settings of their underlying models. ### Model specific settings If you wish to further customize model behavior, you can use a subclass of ModelSettings, like GoogleModelSettings, associated with your model of choice. For example: ```py from pydantic_ai import Agent, UnexpectedModelBehavior from pydantic_ai.models.google import GoogleModelSettings agent = Agent('google-gla:gemini-1.5-flash') try: result = agent.run_sync( 'Write a list of 5 very rude things that I might say to the universe after stubbing my toe in the dark:', model_settings=GoogleModelSettings( temperature=0.0, # general model settings can also be specified gemini_safety_settings=[ { 'category': 'HARM_CATEGORY_HARASSMENT', 'threshold': 'BLOCK_LOW_AND_ABOVE', }, { 'category': 'HARM_CATEGORY_HATE_SPEECH', 'threshold': 'BLOCK_LOW_AND_ABOVE', }, ], ), ) except UnexpectedModelBehavior as e: print(e) # (1)! """ Safety settings triggered, body: """ ``` 1. This error is raised because the safety thresholds were exceeded. ## Runs vs. Conversations An agent **run** might represent an entire conversation — there's no limit to how many messages can be exchanged in a single run. However, a **conversation** might also be composed of multiple runs, especially if you need to maintain state between separate interactions or API calls. Here's an example of a conversation comprised of multiple runs: conversation_example.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') # First run result1 = agent.run_sync('Who was Albert Einstein?') print(result1.output) #> Albert Einstein was a German-born theoretical physicist. # Second run, passing previous messages result2 = agent.run_sync( 'What was his most famous equation?', message_history=result1.new_messages(), # (1)! ) print(result2.output) #> Albert Einstein's most famous equation is (E = mc^2). ``` 1. Continue the conversation; without `message_history` the model would not know who "his" was referring to. *(This example is complete, it can be run "as is")* ## Type safe by design Pydantic AI is designed to work well with static type checkers, like mypy and pyright. Typing is (somewhat) optional Pydantic AI is designed to make type checking as useful as possible for you if you choose to use it, but you don't have to use types everywhere all the time. That said, because Pydantic AI uses Pydantic, and Pydantic uses type hints as the definition for schema and validation, some types (specifically type hints on parameters to tools, and the `output_type` arguments to Agent) are used at runtime. We (the library developers) have messed up if type hints are confusing you more than helping you, if you find this, please create an [issue](https://github.com/pydantic/pydantic-ai/issues) explaining what's annoying you! 
In particular, agents are generic in both the type of their dependencies and the type of the outputs they return, so you can use the type hints to ensure you're using the right types. Consider the following script with type mistakes: type_mistakes.py ```python from dataclasses import dataclass from pydantic_ai import Agent, RunContext @dataclass class User: name: str agent = Agent( 'test', deps_type=User, # (1)! output_type=bool, ) @agent.system_prompt def add_user_name(ctx: RunContext[str]) -> str: # (2)! return f"The user's name is {ctx.deps}." def foobar(x: bytes) -> None: pass result = agent.run_sync('Does their name start with "A"?', deps=User('Anne')) foobar(result.output) # (3)! ``` 1. The agent is defined as expecting an instance of `User` as `deps`. 1. But here `add_user_name` is defined as taking a `str` as the dependency, not a `User`. 1. Since the agent is defined as returning a `bool`, this will raise a type error since `foobar` expects `bytes`. Running `mypy` on this will give the following output: ```bash ➤ uv run mypy type_mistakes.py type_mistakes.py:18: error: Argument 1 to "system_prompt" of "Agent" has incompatible type "Callable[[RunContext[str]], str]"; expected "Callable[[RunContext[User]], str]" [arg-type] type_mistakes.py:28: error: Argument 1 to "foobar" has incompatible type "bool"; expected "bytes" [arg-type] Found 2 errors in 1 file (checked 1 source file) ``` Running `pyright` would identify the same issues. ## System Prompts System prompts might seem simple at first glance since they're just strings (or sequences of strings that are concatenated), but crafting the right system prompt is key to getting the model to behave as you want. Tip For most use cases, you should use `instructions` instead of "system prompts". If you know what you are doing though and want to preserve system prompt messages in the message history sent to the LLM in subsequent completions requests, you can achieve this using the `system_prompt` argument/decorator. See the section below on [Instructions](#instructions) for more information. Generally, system prompts fall into two categories: 1. **Static system prompts**: These are known when writing the code and can be defined via the `system_prompt` parameter of the Agent constructor. 1. **Dynamic system prompts**: These depend in some way on context that isn't known until runtime, and should be defined via functions decorated with @agent.system_prompt. You can add both to a single agent; they're appended in the order they're defined at runtime. Here's an example using both types of system prompts: system_prompts.py ```python from datetime import date from pydantic_ai import Agent, RunContext agent = Agent( 'openai:gpt-4o', deps_type=str, # (1)! system_prompt="Use the customer's name while replying to them.", # (2)! ) @agent.system_prompt # (3)! def add_the_users_name(ctx: RunContext[str]) -> str: return f"The user's name is {ctx.deps}." @agent.system_prompt def add_the_date() -> str: # (4)! return f'The date is {date.today()}.' result = agent.run_sync('What is the date?', deps='Frank') print(result.output) #> Hello Frank, the date today is 2032-01-02. ``` 1. The agent expects a string dependency. 1. Static system prompt defined at agent creation time. 1. Dynamic system prompt defined via a decorator with RunContext, this is called just after `run_sync`, not when the agent is created, so can benefit from runtime information like the dependencies used on that run. 1. 
Another dynamic system prompt, system prompts don't have to have the `RunContext` parameter. *(This example is complete, it can be run "as is")* ## Instructions Instructions are similar to system prompts. The main difference is that when an explicit `message_history` is provided in a call to `Agent.run` and similar methods, *instructions* from any existing messages in the history are not included in the request to the model — only the instructions of the *current* agent are included. You should use: - `instructions` when you want your request to the model to only include system prompts for the *current* agent - `system_prompt` when you want your request to the model to *retain* the system prompts used in previous requests (possibly made using other agents) In general, we recommend using `instructions` instead of `system_prompt` unless you have a specific reason to use `system_prompt`. Instructions, like system prompts, fall into two categories: 1. **Static instructions**: These are known when writing the code and can be defined via the `instructions` parameter of the Agent constructor. 1. **Dynamic instructions**: These rely on context that is only available at runtime and should be defined using functions decorated with @agent.instructions. Unlike dynamic system prompts, which may be reused when `message_history` is present, dynamic instructions are always reevaluated. Both static and dynamic instructions can be added to a single agent, and they are appended in the order they are defined at runtime. Here's an example using both types of instructions: instructions.py ```python from datetime import date from pydantic_ai import Agent, RunContext agent = Agent( 'openai:gpt-4o', deps_type=str, # (1)! instructions="Use the customer's name while replying to them.", # (2)! ) @agent.instructions # (3)! def add_the_users_name(ctx: RunContext[str]) -> str: return f"The user's name is {ctx.deps}." @agent.instructions def add_the_date() -> str: # (4)! return f'The date is {date.today()}.' result = agent.run_sync('What is the date?', deps='Frank') print(result.output) #> Hello Frank, the date today is 2032-01-02. ``` 1. The agent expects a string dependency. 1. Static instructions defined at agent creation time. 1. Dynamic instructions defined via a decorator with RunContext, this is called just after `run_sync`, not when the agent is created, so can benefit from runtime information like the dependencies used on that run. 1. Another dynamic instruction, instructions don't have to have the `RunContext` parameter. *(This example is complete, it can be run "as is")* Note that returning an empty string will result in no instruction message added. ## Reflection and self-correction Validation errors from both function tool parameter validation and [structured output validation](../output/#structured-output) can be passed back to the model with a request to retry. You can also raise ModelRetry from within a [tool](../tools/) or [output function](../output/#output-functions) to tell the model it should retry generating a response. - The default retry count is **1** but can be altered for the entire agent, a specific tool, or outputs. - You can access the current retry count from within a tool or output function via ctx.retry. 
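For instance, a tool can use `ctx.retry` to give the model a more explicit hint once earlier attempts have failed. Here is a minimal, illustrative sketch (the tool and its validation rule are made up):

```python
from pydantic_ai import Agent, ModelRetry, RunContext

agent = Agent('openai:gpt-4o')


@agent.tool(retries=3)
def lookup_code(ctx: RunContext[None], code: str) -> str:
    # Made-up validation rule: the code must be exactly 6 characters.
    if len(code) != 6:
        if ctx.retry >= 2:
            # After a couple of failed attempts, spell out the requirement.
            raise ModelRetry('The code must be exactly 6 characters, e.g. "ABC123".')
        raise ModelRetry('Invalid code, please try again.')
    return f'Code {code} accepted.'
```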
Here's an example: tool_retry.py ```python from pydantic import BaseModel from pydantic_ai import Agent, RunContext, ModelRetry from fake_database import DatabaseConn class ChatResult(BaseModel): user_id: int message: str agent = Agent( 'openai:gpt-4o', deps_type=DatabaseConn, output_type=ChatResult, ) @agent.tool(retries=2) def get_user_by_name(ctx: RunContext[DatabaseConn], name: str) -> int: """Get a user's ID from their full name.""" print(name) #> John #> John Doe user_id = ctx.deps.users.get(name=name) if user_id is None: raise ModelRetry( f'No user found with name {name!r}, remember to provide their full name' ) return user_id result = agent.run_sync( 'Send a message to John Doe asking for coffee next week', deps=DatabaseConn() ) print(result.output) """ user_id=123 message='Hello John, would you be free for coffee sometime next week? Let me know what works for you!' """ ``` ## Model errors If models behave unexpectedly (e.g., the retry limit is exceeded, or their API returns `503`), agent runs will raise UnexpectedModelBehavior. In these cases, capture_run_messages can be used to access the messages exchanged during the run to help diagnose the issue. agent_model_errors.py ```python from pydantic_ai import Agent, ModelRetry, UnexpectedModelBehavior, capture_run_messages agent = Agent('openai:gpt-4o') @agent.tool_plain def calc_volume(size: int) -> int: # (1)! if size == 42: return size**3 else: raise ModelRetry('Please try again.') with capture_run_messages() as messages: # (2)! try: result = agent.run_sync('Please get me the volume of a box with size 6.') except UnexpectedModelBehavior as e: print('An error occurred:', e) #> An error occurred: Tool 'calc_volume' exceeded max retries count of 1 print('cause:', repr(e.__cause__)) #> cause: ModelRetry('Please try again.') print('messages:', messages) """ messages: [ ModelRequest( parts=[ UserPromptPart( content='Please get me the volume of a box with size 6.', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ ToolCallPart( tool_name='calc_volume', args={'size': 6}, tool_call_id='pyd_ai_tool_call_id', ) ], usage=RequestUsage(input_tokens=62, output_tokens=4), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ModelRequest( parts=[ RetryPromptPart( content='Please try again.', tool_name='calc_volume', tool_call_id='pyd_ai_tool_call_id', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ ToolCallPart( tool_name='calc_volume', args={'size': 6}, tool_call_id='pyd_ai_tool_call_id', ) ], usage=RequestUsage(input_tokens=72, output_tokens=8), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ] """ else: print(result.output) ``` 1. Define a tool that will raise `ModelRetry` repeatedly in this case. 1. capture_run_messages is used to capture the messages exchanged during the run. *(This example is complete, it can be run "as is")* Note If you call run, run_sync, or run_stream more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only. # Common Tools Pydantic AI ships with native tools that can be used to enhance your agent's capabilities. ## DuckDuckGo Search Tool The DuckDuckGo search tool allows you to search the web for information. It is built on top of the [DuckDuckGo API](https://github.com/deedy5/ddgs). 
### Installation To use duckduckgo_search_tool, you need to install [`pydantic-ai-slim`](../install/#slim-install) with the `duckduckgo` optional group: ```bash pip install "pydantic-ai-slim[duckduckgo]" ``` ```bash uv add "pydantic-ai-slim[duckduckgo]" ``` ### Usage Here's an example of how you can use the DuckDuckGo search tool with an agent: duckduckgo_search.py ```py from pydantic_ai import Agent from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool agent = Agent( 'openai:o3-mini', tools=[duckduckgo_search_tool()], system_prompt='Search DuckDuckGo for the given query and return the results.', ) result = agent.run_sync( 'Can you list the top five highest-grossing animated films of 2025?' ) print(result.output) """ I looked into several sources on animated box‐office performance in 2025, and while detailed rankings can shift as more money is tallied, multiple independent reports have already highlighted a couple of record‐breaking shows. For example: • Ne Zha 2 – News outlets (Variety, Wikipedia's "List of animated feature films of 2025", and others) have reported that this Chinese title not only became the highest‑grossing animated film of 2025 but also broke records as the highest‑grossing non‑English animated film ever. One article noted its run exceeded US$1.7 billion. • Inside Out 2 – According to data shared on Statista and in industry news, this Pixar sequel has been on pace to set new records (with some sources even noting it as the highest‑grossing animated film ever, as of January 2025). Beyond those two, some entertainment trade sites (for example, a Just Jared article titled "Top 10 Highest-Earning Animated Films at the Box Office Revealed") have begun listing a broader top‑10. Although full consolidated figures can sometimes differ by source and are updated daily during a box‑office run, many of the industry trackers have begun to single out five films as the biggest earners so far in 2025. Unfortunately, although multiple articles discuss the "top animated films" of 2025, there isn't yet a single, universally accepted list with final numbers that names the complete top five. (Box‑office rankings, especially mid‑year, can be fluid as films continue to add to their totals.) Based on what several sources note so far, the two undisputed leaders are: 1. Ne Zha 2 2. Inside Out 2 The remaining top spots (3–5) are reported by some outlets in their "Top‑10 Animated Films" lists for 2025 but the titles and order can vary depending on the source and the exact cut‑off date of the data. For the most up‑to‑date and detailed ranking (including the 3rd, 4th, and 5th highest‑grossing films), I recommend checking resources like: • Wikipedia's "List of animated feature films of 2025" page • Box‑office tracking sites (such as Box Office Mojo or The Numbers) • Trade articles like the one on Just Jared To summarize with what is clear from the current reporting: 1. Ne Zha 2 2. Inside Out 2 3–5. Other animated films (yet to be definitively finalized across all reporting outlets) If you're looking for a final, consensus list of the top five, it may be best to wait until the 2025 year‑end box‑office tallies are in or to consult a regularly updated entertainment industry source. Would you like help finding a current source or additional details on where to look for the complete updated list? """ ``` ## Tavily Search Tool Info Tavily is a paid service, but they have free credits to explore their product. 
You need to [sign up for an account](https://app.tavily.com/home) and get an API key to use the Tavily search tool. The Tavily search tool allows you to search the web for information. It is built on top of the [Tavily API](https://tavily.com/). ### Installation To use tavily_search_tool, you need to install [`pydantic-ai-slim`](../install/#slim-install) with the `tavily` optional group: ```bash pip install "pydantic-ai-slim[tavily]" ``` ```bash uv add "pydantic-ai-slim[tavily]" ``` ### Usage Here's an example of how you can use the Tavily search tool with an agent: tavily_search.py ```py import os from pydantic_ai import Agent from pydantic_ai.common_tools.tavily import tavily_search_tool api_key = os.getenv('TAVILY_API_KEY') assert api_key is not None agent = Agent( 'openai:o3-mini', tools=[tavily_search_tool(api_key)], system_prompt='Search Tavily for the given query and return the results.', ) result = agent.run_sync('Tell me the top news in the GenAI world, give me links.') print(result.output) """ Here are some of the top recent news articles related to GenAI: 1. How CLEAR users can improve risk analysis with GenAI – Thomson Reuters Read more: https://legal.thomsonreuters.com/blog/how-clear-users-can-improve-risk-analysis-with-genai/ (This article discusses how CLEAR's new GenAI-powered tool streamlines risk analysis by quickly summarizing key information from various public data sources.) 2. TELUS Digital Survey Reveals Enterprise Employees Are Entering Sensitive Data Into AI Assistants More Than You Think – FT.com Read more: https://markets.ft.com/data/announce/detail?dockey=600-202502260645BIZWIRE_USPRX____20250226_BW490609-1 (This news piece highlights findings from a TELUS Digital survey showing that many enterprise employees use public GenAI tools and sometimes even enter sensitive data.) 3. The Essential Guide to Generative AI – Virtualization Review Read more: https://virtualizationreview.com/Whitepapers/2025/02/SNOWFLAKE-The-Essential-Guide-to-Generative-AI.aspx (This guide provides insights into how GenAI is revolutionizing enterprise strategies and productivity, with input from industry leaders.) Feel free to click on the links to dive deeper into each story! """ ``` # Dependencies Pydantic AI uses a dependency injection system to provide data and services to your agent's [system prompts](../agents/#system-prompts), [tools](../tools/) and [output validators](../output/#output-validator-functions). Matching Pydantic AI's design philosophy, our dependency system tries to use existing best practice in Python development rather than inventing esoteric "magic"; this should make dependencies type-safe, understandable, easier to test, and ultimately easier to deploy in production. ## Defining Dependencies Dependencies can be any Python type. While in simple cases you might be able to pass a single object as a dependency (e.g. an HTTP connection), dataclasses are generally a convenient container when your dependencies include multiple objects. Here's an example of defining an agent that requires dependencies. (**Note:** dependencies aren't actually used in this example, see [Accessing Dependencies](#accessing-dependencies) below) unused_dependencies.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent @dataclass class MyDeps: # (1)! api_key: str http_client: httpx.AsyncClient agent = Agent( 'openai:gpt-4o', deps_type=MyDeps, # (2)!
) async def main(): async with httpx.AsyncClient() as client: deps = MyDeps('foobar', client) result = await agent.run( 'Tell me a joke.', deps=deps, # (3)! ) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. Define a dataclass to hold dependencies. 1. Pass the dataclass type to the `deps_type` argument of the Agent constructor. **Note**: we're passing the type here, NOT an instance, this parameter is not actually used at runtime, it's here so we can get full type checking of the agent. 1. When running the agent, pass an instance of the dataclass to the `deps` parameter. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Accessing Dependencies Dependencies are accessed through the RunContext type, this should be the first parameter of system prompt functions etc. system_prompt_dependencies.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.AsyncClient agent = Agent( 'openai:gpt-4o', deps_type=MyDeps, ) @agent.system_prompt # (1)! async def get_system_prompt(ctx: RunContext[MyDeps]) -> str: # (2)! response = await ctx.deps.http_client.get( # (3)! 'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, # (4)! ) response.raise_for_status() return f'Prompt: {response.text}' async def main(): async with httpx.AsyncClient() as client: deps = MyDeps('foobar', client) result = await agent.run('Tell me a joke.', deps=deps) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. RunContext may optionally be passed to a system_prompt function as the only argument. 1. RunContext is parameterized with the type of the dependencies, if this type is incorrect, static type checkers will raise an error. 1. Access dependencies through the .deps attribute. 1. Access dependencies through the .deps attribute. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ### Asynchronous vs. Synchronous dependencies [System prompt functions](../agents/#system-prompts), [function tools](../tools/) and [output validators](../output/#output-validator-functions) are all run in the async context of an agent run. If these functions are not coroutines (e.g. `async def`) they are called with run_in_executor in a thread pool, it's therefore marginally preferable to use `async` methods where dependencies perform IO, although synchronous dependencies should work fine too. `run` vs. `run_sync` and Asynchronous vs. Synchronous dependencies Whether you use synchronous or asynchronous dependencies, is completely independent of whether you use `run` or `run_sync` — `run_sync` is just a wrapper around `run` and agents are always run in an async context. Here's the same example as above, but with a synchronous dependency: sync_dependencies.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.Client # (1)! agent = Agent( 'openai:gpt-4o', deps_type=MyDeps, ) @agent.system_prompt def get_system_prompt(ctx: RunContext[MyDeps]) -> str: # (2)! 
response = ctx.deps.http_client.get( 'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'} ) response.raise_for_status() return f'Prompt: {response.text}' async def main(): deps = MyDeps('foobar', httpx.Client()) result = await agent.run( 'Tell me a joke.', deps=deps, ) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. Here we use a synchronous `httpx.Client` instead of an asynchronous `httpx.AsyncClient`. 1. To match the synchronous dependency, the system prompt function is now a plain function, not a coroutine. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Full Example As well as system prompts, dependencies can be used in [tools](../tools/) and [output validators](../output/#output-validator-functions). full_example.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, ModelRetry, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.AsyncClient agent = Agent( 'openai:gpt-4o', deps_type=MyDeps, ) @agent.system_prompt async def get_system_prompt(ctx: RunContext[MyDeps]) -> str: response = await ctx.deps.http_client.get('https://example.com') response.raise_for_status() return f'Prompt: {response.text}' @agent.tool # (1)! async def get_joke_material(ctx: RunContext[MyDeps], subject: str) -> str: response = await ctx.deps.http_client.get( 'https://example.com#jokes', params={'subject': subject}, headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, ) response.raise_for_status() return response.text @agent.output_validator # (2)! async def validate_output(ctx: RunContext[MyDeps], output: str) -> str: response = await ctx.deps.http_client.post( 'https://example.com#validate', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, params={'query': output}, ) if response.status_code == 400: raise ModelRetry(f'invalid response: {response.text}') response.raise_for_status() return output async def main(): async with httpx.AsyncClient() as client: deps = MyDeps('foobar', client) result = await agent.run('Tell me a joke.', deps=deps) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. To pass `RunContext` to a tool, use the tool decorator. 1. `RunContext` may optionally be passed to a output_validator function as the first argument. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Overriding Dependencies When testing agents, it's useful to be able to customise dependencies. While this can sometimes be done by calling the agent directly within unit tests, we can also override dependencies while calling application code which in turn calls the agent. This is done via the override method on the agent. joke_app.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.AsyncClient async def system_prompt_factory(self) -> str: # (1)! response = await self.http_client.get('https://example.com') response.raise_for_status() return f'Prompt: {response.text}' joke_agent = Agent('openai:gpt-4o', deps_type=MyDeps) @joke_agent.system_prompt async def get_system_prompt(ctx: RunContext[MyDeps]) -> str: return await ctx.deps.system_prompt_factory() # (2)! async def application_code(prompt: str) -> str: # (3)! ... ... 
# now deep within application code we call our agent async with httpx.AsyncClient() as client: app_deps = MyDeps('foobar', client) result = await joke_agent.run(prompt, deps=app_deps) # (4)! return result.output ``` 1. Define a method on the dependency to make the system prompt easier to customise. 1. Call the system prompt factory from within the system prompt function. 1. Application code that calls the agent, in a real application this might be an API endpoint. 1. Call the agent from within the application code, in a real application this call might be deep within a call stack. Note `app_deps` here will NOT be used when deps are overridden. *(This example is complete, it can be run "as is")* test_joke_app.py ```python from joke_app import MyDeps, application_code, joke_agent class TestMyDeps(MyDeps): # (1)! async def system_prompt_factory(self) -> str: return 'test prompt' async def test_application_code(): test_deps = TestMyDeps('test_key', None) # (2)! with joke_agent.override(deps=test_deps): # (3)! joke = await application_code('Tell me a joke.') # (4)! assert joke.startswith('Did you hear about the toothpaste scandal?') ``` 1. Define a subclass of `MyDeps` in tests to customise the system prompt factory. 1. Create an instance of the test dependency, we don't need to pass an `http_client` here as it's not used. 1. Override the dependencies of the agent for the duration of the `with` block, `test_deps` will be used when the agent is run. 1. Now we can safely call our application code, the agent will use the overridden dependencies. ## Examples The following examples demonstrate how to use dependencies in Pydantic AI: - [Weather Agent](../examples/weather-agent/) - [SQL Generation](../examples/sql-gen/) - [RAG](../examples/rag/) # Messages and chat history Pydantic AI provides access to messages exchanged during an agent run. These messages can be used both to continue a coherent conversation, and to understand how an agent performed. ### Accessing Messages from Results After running an agent, you can access the messages exchanged during that run from the `result` object. Both RunResult (returned by Agent.run, Agent.run_sync) and StreamedRunResult (returned by Agent.run_stream) have the following methods: - all_messages(): returns all messages, including messages from prior runs. There's also a variant that returns JSON bytes, all_messages_json(). - new_messages(): returns only the messages from the current run. There's also a variant that returns JSON bytes, new_messages_json(). StreamedRunResult and complete messages On StreamedRunResult, the messages returned from these methods will only include the final result message once the stream has finished. E.g. you've awaited one of the following coroutines: - StreamedRunResult.stream_output() - StreamedRunResult.stream_text() - StreamedRunResult.stream_responses() - StreamedRunResult.get_output() **Note:** The final result message will NOT be added to result messages if you use .stream_text(delta=True) since in this case the result content is never built as one string. Example of accessing methods on a RunResult : run_result_messages.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.') result = agent.run_sync('Tell me a joke.') print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. 
# all messages from the run print(result.all_messages()) """ [ ModelRequest( parts=[ SystemPromptPart( content='Be a helpful assistant.', timestamp=datetime.datetime(...), ), UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ), ] ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=60, output_tokens=12), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ] """ ``` *(This example is complete, it can be run "as is")* Example of accessing methods on a StreamedRunResult : streamed_run_result_messages.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.') async def main(): async with agent.run_stream('Tell me a joke.') as result: # incomplete messages before the stream finishes print(result.all_messages()) """ [ ModelRequest( parts=[ SystemPromptPart( content='Be a helpful assistant.', timestamp=datetime.datetime(...), ), UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ), ] ) ] """ async for text in result.stream_text(): print(text) #> Did you hear #> Did you hear about the toothpaste #> Did you hear about the toothpaste scandal? They called #> Did you hear about the toothpaste scandal? They called it Colgate. # complete messages once the stream finishes print(result.all_messages()) """ [ ModelRequest( parts=[ SystemPromptPart( content='Be a helpful assistant.', timestamp=datetime.datetime(...), ), UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ), ] ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=50, output_tokens=12), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ] """ ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ### Using Messages as Input for Further Agent Runs The primary use of message histories in Pydantic AI is to maintain context across multiple agent runs. To use existing messages in a run, pass them to the `message_history` parameter of Agent.run, Agent.run_sync or Agent.run_stream. If `message_history` is set and not empty, a new system prompt is not generated — we assume the existing message history includes a system prompt. Reusing messages in a conversation ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') print(result1.output) #> Did you hear about the toothpaste scandal? They called it Colgate. result2 = agent.run_sync('Explain?', message_history=result1.new_messages()) print(result2.output) #> This is an excellent joke invented by Samuel Colvin, it needs no explanation. print(result2.all_messages()) """ [ ModelRequest( parts=[ SystemPromptPart( content='Be a helpful assistant.', timestamp=datetime.datetime(...), ), UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ), ] ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=60, output_tokens=12), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ModelRequest( parts=[ UserPromptPart( content='Explain?', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ TextPart( content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.' 
) ], usage=RequestUsage(input_tokens=61, output_tokens=26), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ] """ ``` *(This example is complete, it can be run "as is")* ## Storing and loading messages (to JSON) While maintaining conversation state in memory is enough for many applications, often times you may want to store the messages history of an agent run on disk or in a database. This might be for evals, for sharing data between Python and JavaScript/TypeScript, or any number of other use cases. The intended way to do this is using a `TypeAdapter`. We export ModelMessagesTypeAdapter that can be used for this, or you can create your own. Here's an example showing how: serialize messages to json ```python from pydantic_core import to_jsonable_python from pydantic_ai import Agent from pydantic_ai.messages import ModelMessagesTypeAdapter # (1)! agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') history_step_1 = result1.all_messages() as_python_objects = to_jsonable_python(history_step_1) # (2)! same_history_as_step_1 = ModelMessagesTypeAdapter.validate_python(as_python_objects) result2 = agent.run_sync( # (3)! 'Tell me a different joke.', message_history=same_history_as_step_1 ) ``` 1. Alternatively, you can create a `TypeAdapter` from scratch: ```python from pydantic import TypeAdapter from pydantic_ai.messages import ModelMessage ModelMessagesTypeAdapter = TypeAdapter(list[ModelMessage]) ``` 1. Alternatively you can serialize to/from JSON directly: ```python from pydantic_core import to_json ... as_json_objects = to_json(history_step_1) same_history_as_step_1 = ModelMessagesTypeAdapter.validate_json(as_json_objects) ``` 1. You can now continue the conversation with history `same_history_as_step_1` despite creating a new agent run. *(This example is complete, it can be run "as is")* ## Other ways of using messages Since messages are defined by simple dataclasses, you can manually create and manipulate, e.g. for testing. The message format is independent of the model used, so you can use messages in different agents, or the same agent with different models. In the example below, we reuse the message from the first agent run, which uses the `openai:gpt-4o` model, in a second agent run using the `google-gla:gemini-1.5-pro` model. Reusing messages with a different model ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') print(result1.output) #> Did you hear about the toothpaste scandal? They called it Colgate. result2 = agent.run_sync( 'Explain?', model='google-gla:gemini-1.5-pro', message_history=result1.new_messages(), ) print(result2.output) #> This is an excellent joke invented by Samuel Colvin, it needs no explanation. print(result2.all_messages()) """ [ ModelRequest( parts=[ SystemPromptPart( content='Be a helpful assistant.', timestamp=datetime.datetime(...), ), UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ), ] ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=60, output_tokens=12), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ModelRequest( parts=[ UserPromptPart( content='Explain?', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ TextPart( content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.' 
) ], usage=RequestUsage(input_tokens=61, output_tokens=26), model_name='gemini-1.5-pro', timestamp=datetime.datetime(...), ), ] """ ``` ## Processing Message History Sometimes you may want to modify the message history before it's sent to the model. This could be for privacy reasons (filtering out sensitive information), to save costs on tokens, to give less context to the LLM, or custom processing logic. Pydantic AI provides a `history_processors` parameter on `Agent` that allows you to intercept and modify the message history before each model request. History processors replace the message history History processors replace the message history in the state with the processed messages, including the new user prompt part. This means that if you want to keep the original message history, you need to make a copy of it. ### Usage The `history_processors` is a list of callables that take a list of ModelMessage and return a modified list of the same type. Each processor is applied in sequence, and processors can be either synchronous or asynchronous. simple_history_processor.py ```python from pydantic_ai import Agent from pydantic_ai.messages import ( ModelMessage, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: """Remove all ModelResponse messages, keeping only ModelRequest messages.""" return [msg for msg in messages if isinstance(msg, ModelRequest)] # Create agent with history processor agent = Agent('openai:gpt-4o', history_processors=[filter_responses]) # Example: Create some conversation history message_history = [ ModelRequest(parts=[UserPromptPart(content='What is 2+2?')]), ModelResponse(parts=[TextPart(content='2+2 equals 4')]), # This will be filtered out ] # When you run the agent, the history processor will filter out ModelResponse messages # result = agent.run_sync('What about 3+3?', message_history=message_history) ``` #### Keep Only Recent Messages You can use the `history_processor` to only keep the recent messages: keep_recent_messages.py ```python from pydantic_ai import Agent from pydantic_ai.messages import ModelMessage async def keep_recent_messages(messages: list[ModelMessage]) -> list[ModelMessage]: """Keep only the last 5 messages to manage token usage.""" return messages[-5:] if len(messages) > 5 else messages agent = Agent('openai:gpt-4o', history_processors=[keep_recent_messages]) # Example: Even with a long conversation history, only the last 5 messages are sent to the model long_conversation_history: list[ModelMessage] = [] # Your long conversation history here # result = agent.run_sync('What did we discuss?', message_history=long_conversation_history) ``` Be careful when slicing the message history When slicing the message history, you need to make sure that tool calls and returns are paired, otherwise the LLM may return an error. For more details, refer to [this GitHub issue](https://github.com/pydantic/pydantic-ai/issues/2050#issuecomment-3019976269). 
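One way to respect that pairing when trimming is to take a recent window and then drop any leading messages that would begin the history with an orphaned tool return. Here's a minimal sketch of that idea; the `keep_recent_paired_messages` name and the window size are illustrative, not part of the library:

keep_recent_paired_messages.py

```python
from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelRequest, ToolReturnPart


def keep_recent_paired_messages(messages: list[ModelMessage]) -> list[ModelMessage]:
    """Keep a recent window, but never start it on an orphaned tool return."""
    recent = messages[-6:]  # the window size is arbitrary, tune it for your use case
    # If the window starts with a request that carries tool returns, the matching
    # tool calls were sliced away, so keep dropping messages until it starts cleanly.
    while recent and (
        not isinstance(recent[0], ModelRequest)
        or any(isinstance(part, ToolReturnPart) for part in recent[0].parts)
    ):
        recent = recent[1:]
    return recent or messages[-1:]  # always keep at least the newest message


agent = Agent('openai:gpt-4o', history_processors=[keep_recent_paired_messages])
```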
#### `RunContext` parameter History processors can optionally accept a RunContext parameter to access additional information about the current run, such as dependencies, model information, and usage statistics: context_aware_processor.py ```python from pydantic_ai import Agent, RunContext from pydantic_ai.messages import ModelMessage def context_aware_processor( ctx: RunContext[None], messages: list[ModelMessage], ) -> list[ModelMessage]: # Access current usage current_tokens = ctx.usage.total_tokens # Filter messages based on context if current_tokens > 1000: return messages[-3:] # Keep only recent messages when token usage is high return messages agent = Agent('openai:gpt-4o', history_processors=[context_aware_processor]) ``` This allows for more sophisticated message processing based on the current state of the agent run. #### Summarize Old Messages Use an LLM to summarize older messages to preserve context while reducing tokens. summarize_old_messages.py ```python from pydantic_ai import Agent from pydantic_ai.messages import ModelMessage # Use a cheaper model to summarize old messages. summarize_agent = Agent( 'openai:gpt-4o-mini', instructions=""" Summarize this conversation, omitting small talk and unrelated topics. Focus on the technical discussion and next steps. """, ) async def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]: # Summarize the oldest 10 messages if len(messages) > 10: oldest_messages = messages[:10] summary = await summarize_agent.run(message_history=oldest_messages) # Return the last message and the summary return summary.new_messages() + messages[-1:] return messages agent = Agent('openai:gpt-4o', history_processors=[summarize_old_messages]) ``` Be careful when summarizing the message history When summarizing the message history, you need to make sure that tool calls and returns are paired, otherwise the LLM may return an error. For more details, refer to [this GitHub issue](https://github.com/pydantic/pydantic-ai/issues/2050#issuecomment-3019976269), where you can find examples of summarizing the message history. 
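Because processors can accept `RunContext`, you can also combine dependency access with the privacy use case mentioned earlier, for example masking values supplied via deps before they reach the model. The sketch below assumes a hypothetical `RedactionDeps` dataclass and `redact_secrets` processor; neither is part of the library:

redact_secrets_processor.py

```python
from dataclasses import dataclass, replace

from pydantic_ai import Agent, RunContext
from pydantic_ai.messages import ModelMessage, ModelRequest, UserPromptPart


@dataclass
class RedactionDeps:
    secrets: list[str]


def redact_secrets(
    ctx: RunContext[RedactionDeps], messages: list[ModelMessage]
) -> list[ModelMessage]:
    """Mask known secret strings in user prompts before they are sent to the model."""
    processed: list[ModelMessage] = []
    for message in messages:
        if isinstance(message, ModelRequest):
            parts = []
            for part in message.parts:
                if isinstance(part, UserPromptPart) and isinstance(part.content, str):
                    content = part.content
                    for secret in ctx.deps.secrets:
                        content = content.replace(secret, '[REDACTED]')
                    part = replace(part, content=content)  # message parts are plain dataclasses
                parts.append(part)
            message = replace(message, parts=parts)
        processed.append(message)
    return processed


agent = Agent('openai:gpt-4o', deps_type=RedactionDeps, history_processors=[redact_secrets])
# e.g. agent.run_sync('My API key is sk-example-123', deps=RedactionDeps(secrets=['sk-example-123']))
```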
### Testing History Processors You can test what messages are actually sent to the model provider using FunctionModel: test_history_processor.py ```python import pytest from pydantic_ai import Agent from pydantic_ai.messages import ( ModelMessage, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) from pydantic_ai.models.function import AgentInfo, FunctionModel @pytest.fixture def received_messages() -> list[ModelMessage]: return [] @pytest.fixture def function_model(received_messages: list[ModelMessage]) -> FunctionModel: def capture_model_function(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse: # Capture the messages that the provider actually receives received_messages.clear() received_messages.extend(messages) return ModelResponse(parts=[TextPart(content='Provider response')]) return FunctionModel(capture_model_function) def test_history_processor(function_model: FunctionModel, received_messages: list[ModelMessage]): def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: return [msg for msg in messages if isinstance(msg, ModelRequest)] agent = Agent(function_model, history_processors=[filter_responses]) message_history = [ ModelRequest(parts=[UserPromptPart(content='Question 1')]), ModelResponse(parts=[TextPart(content='Answer 1')]), ] agent.run_sync('Question 2', message_history=message_history) assert received_messages == [ ModelRequest(parts=[UserPromptPart(content='Question 1')]), ModelRequest(parts=[UserPromptPart(content='Question 2')]), ] ``` ### Multiple Processors You can also use multiple processors: multiple_history_processors.py ```python from pydantic_ai import Agent from pydantic_ai.messages import ModelMessage, ModelRequest def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: return [msg for msg in messages if isinstance(msg, ModelRequest)] def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]: return messages[-5:] agent = Agent('openai:gpt-4o', history_processors=[filter_responses, summarize_old_messages]) ``` In this case, the `filter_responses` processor will be applied first, and the `summarize_old_messages` processor will be applied second. ## Examples For a more complete example of using messages in conversations, see the [chat app](../examples/chat-app/) example. # Multi-agent Applications There are roughly four levels of complexity when building applications with Pydantic AI: 1. Single agent workflows — what most of the `pydantic_ai` documentation covers 1. [Agent delegation](#agent-delegation) — agents using another agent via tools 1. [Programmatic agent hand-off](#programmatic-agent-hand-off) — one agent runs, then application code calls another agent 1. [Graph based control flow](../graph/) — for the most complex cases, a graph-based state machine can be used to control the execution of multiple agents Of course, you can combine multiple strategies in a single application. ## Agent delegation "Agent delegation" refers to the scenario where an agent delegates work to another agent, then takes back control when the delegate agent (the agent called from within a tool) finishes. If you want to hand off control to another agent completely, without coming back to the first agent, you can use an [output function](../output/#output-functions). Since agents are stateless and designed to be global, you do not need to include the agent itself in agent [dependencies](../dependencies/). 
You'll generally want to pass ctx.usage to the usage keyword argument of the delegate agent run so usage within that run counts towards the total usage of the parent agent run. Multiple models Agent delegation doesn't need to use the same model for each agent. If you choose to use different models within a run, calculating the monetary cost from the final result.usage() of the run will not be possible, but you can still use UsageLimits — including `request_limit`, `total_tokens_limit`, and `tool_calls_limit` — to avoid unexpected costs or runaway tool loops. agent_delegation_simple.py ```python from pydantic_ai import Agent, RunContext, UsageLimits joke_selection_agent = Agent( # (1)! 'openai:gpt-4o', system_prompt=( 'Use the `joke_factory` to generate some jokes, then choose the best. ' 'You must return just a single joke.' ), ) joke_generation_agent = Agent( # (2)! 'google-gla:gemini-1.5-flash', output_type=list[str] ) @joke_selection_agent.tool async def joke_factory(ctx: RunContext[None], count: int) -> list[str]: r = await joke_generation_agent.run( # (3)! f'Please generate {count} jokes.', usage=ctx.usage, # (4)! ) return r.output # (5)! result = joke_selection_agent.run_sync( 'Tell me a joke.', usage_limits=UsageLimits(request_limit=5, total_tokens_limit=500), ) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. print(result.usage()) #> RunUsage(input_tokens=204, output_tokens=24, requests=3, tool_calls=1) ``` 1. The "parent" or controlling agent. 1. The "delegate" agent, which is called from within a tool of the parent agent. 1. Call the delegate agent from within a tool of the parent agent. 1. Pass the usage from the parent agent to the delegate agent so the final result.usage() includes the usage from both agents. 1. Since the function returns `list[str]`, and the `output_type` of `joke_generation_agent` is also `list[str]`, we can simply return `r.output` from the tool. *(This example is complete, it can be run "as is")* The control flow for this example is pretty simple and can be summarised as follows: ``` graph TD START --> joke_selection_agent joke_selection_agent --> joke_factory["joke_factory (tool)"] joke_factory --> joke_generation_agent joke_generation_agent --> joke_factory joke_factory --> joke_selection_agent joke_selection_agent --> END ``` ### Agent delegation and dependencies Generally the delegate agent needs to either have the same [dependencies](../dependencies/) as the calling agent, or dependencies which are a subset of the calling agent's dependencies. Initializing dependencies We say "generally" above since there's nothing to stop you initializing dependencies within a tool call and therefore using interdependencies in a delegate agent that are not available on the parent, this should often be avoided since it can be significantly slower than reusing connections etc. from the parent agent. agent_delegation_deps.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class ClientAndKey: # (1)! http_client: httpx.AsyncClient api_key: str joke_selection_agent = Agent( 'openai:gpt-4o', deps_type=ClientAndKey, # (2)! system_prompt=( 'Use the `joke_factory` tool to generate some jokes on the given subject, ' 'then choose the best. You must return just a single joke.' ), ) joke_generation_agent = Agent( 'google-gla:gemini-1.5-flash', deps_type=ClientAndKey, # (4)! 
output_type=list[str], system_prompt=( 'Use the "get_jokes" tool to get some jokes on the given subject, ' 'then extract each joke into a list.' ), ) @joke_selection_agent.tool async def joke_factory(ctx: RunContext[ClientAndKey], count: int) -> list[str]: r = await joke_generation_agent.run( f'Please generate {count} jokes.', deps=ctx.deps, # (3)! usage=ctx.usage, ) return r.output @joke_generation_agent.tool # (5)! async def get_jokes(ctx: RunContext[ClientAndKey], count: int) -> str: response = await ctx.deps.http_client.get( 'https://example.com', params={'count': count}, headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, ) response.raise_for_status() return response.text async def main(): async with httpx.AsyncClient() as client: deps = ClientAndKey(client, 'foobar') result = await joke_selection_agent.run('Tell me a joke.', deps=deps) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. print(result.usage()) # (6)! #> RunUsage(input_tokens=309, output_tokens=32, requests=4, tool_calls=2) ``` 1. Define a dataclass to hold the client and API key dependencies. 1. Set the `deps_type` of the calling agent — `joke_selection_agent` here. 1. Pass the dependencies to the delegate agent's run method within the tool call. 1. Also set the `deps_type` of the delegate agent — `joke_generation_agent` here. 1. Define a tool on the delegate agent that uses the dependencies to make an HTTP request. 1. Usage now includes 4 requests — 2 from the calling agent and 2 from the delegate agent. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* This example shows how even a fairly simple agent delegation can lead to a complex control flow: ``` graph TD START --> joke_selection_agent joke_selection_agent --> joke_factory["joke_factory (tool)"] joke_factory --> joke_generation_agent joke_generation_agent --> get_jokes["get_jokes (tool)"] get_jokes --> http_request["HTTP request"] http_request --> get_jokes get_jokes --> joke_generation_agent joke_generation_agent --> joke_factory joke_factory --> joke_selection_agent joke_selection_agent --> END ``` ## Programmatic agent hand-off "Programmatic agent hand-off" refers to the scenario where multiple agents are called in succession, with application code and/or a human in the loop responsible for deciding which agent to call next. Here agents don't need to use the same deps. Here we show two agents used in succession, the first to find a flight and the second to extract the user's seat preference. programmatic_handoff.py ```python from typing import Literal from pydantic import BaseModel, Field from rich.prompt import Prompt from pydantic_ai import Agent, RunContext, RunUsage, UsageLimits from pydantic_ai.messages import ModelMessage class FlightDetails(BaseModel): flight_number: str class Failed(BaseModel): """Unable to find a satisfactory choice.""" flight_search_agent = Agent[None, FlightDetails | Failed]( # (1)! 'openai:gpt-4o', output_type=FlightDetails | Failed, # type: ignore system_prompt=( 'Use the "flight_search" tool to find a flight ' 'from the given origin to the given destination.' ), ) @flight_search_agent.tool # (2)! async def flight_search( ctx: RunContext[None], origin: str, destination: str ) -> FlightDetails | None: # in reality, this would call a flight search API or # use a browser to scrape a flight search website return FlightDetails(flight_number='AK456') usage_limits = UsageLimits(request_limit=15) # (3)! 
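# UsageLimits can also cap token usage and tool calls across the whole app,
# e.g. (these limit values are purely illustrative):
# usage_limits = UsageLimits(request_limit=15, total_tokens_limit=5000, tool_calls_limit=10)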
async def find_flight(usage: RunUsage) -> FlightDetails | None: # (4)! message_history: list[ModelMessage] | None = None for _ in range(3): prompt = Prompt.ask( 'Where would you like to fly from and to?', ) result = await flight_search_agent.run( prompt, message_history=message_history, usage=usage, usage_limits=usage_limits, ) if isinstance(result.output, FlightDetails): return result.output else: message_history = result.all_messages( output_tool_return_content='Please try again.' ) class SeatPreference(BaseModel): row: int = Field(ge=1, le=30) seat: Literal['A', 'B', 'C', 'D', 'E', 'F'] # This agent is responsible for extracting the user's seat selection seat_preference_agent = Agent[None, SeatPreference | Failed]( # (5)! 'openai:gpt-4o', output_type=SeatPreference | Failed, # type: ignore system_prompt=( "Extract the user's seat preference. " 'Seats A and F are window seats. ' 'Row 1 is the front row and has extra leg room. ' 'Rows 14, and 20 also have extra leg room. ' ), ) async def find_seat(usage: RunUsage) -> SeatPreference: # (6)! message_history: list[ModelMessage] | None = None while True: answer = Prompt.ask('What seat would you like?') result = await seat_preference_agent.run( answer, message_history=message_history, usage=usage, usage_limits=usage_limits, ) if isinstance(result.output, SeatPreference): return result.output else: print('Could not understand seat preference. Please try again.') message_history = result.all_messages() async def main(): # (7)! usage: RunUsage = RunUsage() opt_flight_details = await find_flight(usage) if opt_flight_details is not None: print(f'Flight found: {opt_flight_details.flight_number}') #> Flight found: AK456 seat_preference = await find_seat(usage) print(f'Seat preference: {seat_preference}') #> Seat preference: row=1 seat='A' ``` 1. Define the first agent, which finds a flight. We use an explicit type annotation until [PEP-747](https://peps.python.org/pep-0747/) lands, see [structured output](../output/#structured-output). We use a union as the output type so the model can communicate if it's unable to find a satisfactory choice; internally, each member of the union will be registered as a separate tool. 1. Define a tool on the agent to find a flight. In this simple case we could dispense with the tool and just define the agent to return structured data, then search for a flight, but in more complex scenarios the tool would be necessary. 1. Define usage limits for the entire app. 1. Define a function to find a flight, which asks the user for their preferences and then calls the agent to find a flight. 1. As with `flight_search_agent` above, we use an explicit type annotation to define the agent. 1. Define a function to find the user's seat preference, which asks the user for their seat preference and then calls the agent to extract the seat preference. 1. Now that we've put our logic for running each agent into separate functions, our main app becomes very simple. 
*(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* The control flow for this example can be summarised as follows: ``` graph TB START --> ask_user_flight["ask user for flight"] subgraph find_flight flight_search_agent --> ask_user_flight ask_user_flight --> flight_search_agent end flight_search_agent --> ask_user_seat["ask user for seat"] flight_search_agent --> END subgraph find_seat seat_preference_agent --> ask_user_seat ask_user_seat --> seat_preference_agent end seat_preference_agent --> END ``` ## Pydantic Graphs See the [graph](../graph/) documentation on when and how to use graphs. ## Examples The following examples demonstrate how to use multiple agents in Pydantic AI: - [Flight booking](../examples/flight-booking/) # Function Tools Function tools provide a mechanism for models to perform actions and retrieve extra information to help them generate a response. They're useful when you want to enable the model to take some action and use the result, when it is impractical or impossible to put all the context an agent might need into the instructions, or when you want to make agents' behavior more deterministic or reliable by deferring some of the logic required to generate a response to another (not necessarily AI-powered) tool. If you want a model to be able to call a function as its final action, without the result being sent back to the model, you can use an [output function](../output/#output-functions) instead. There are a number of ways to register tools with an agent: - via the @agent.tool decorator — for tools that need access to the agent context - via the @agent.tool_plain decorator — for tools that do not need access to the agent context - via the tools keyword argument to `Agent` which can take either plain functions, or instances of Tool For more advanced use cases, the [toolsets](../toolsets/) feature lets you manage collections of tools (built by you or provided by an [MCP server](../mcp/client/) or other [third party](../third-party-tools/#third-party-tools)) and register them with an agent in one go via the toolsets keyword argument to `Agent`. Internally, all `tools` and `toolsets` are gathered into a single [combined toolset](../toolsets/#combining-toolsets) that's made available to the model. Function tools vs. RAG Function tools are basically the "R" of RAG (Retrieval-Augmented Generation) — they augment what the model can do by letting it request extra information. The main semantic difference between Pydantic AI Tools and RAG is that RAG is synonymous with vector search, while Pydantic AI tools are more general-purpose. (Note: we may add support for vector search functionality in the future, particularly an API for generating embeddings. See [#58](https://github.com/pydantic/pydantic-ai/issues/58)) Function Tools vs. Structured Outputs As the name suggests, function tools use the model's "tools" or "functions" API to let the model know what is available to call. Tools or functions are also used to define the schema(s) for [structured output](../output/) when using the default [tool output mode](../output/#tool-output), thus a model might have access to many tools, some of which call function tools while others end the run and produce a final output. ## Registering via Decorator `@agent.tool` is considered the default decorator since in the majority of cases tools will need access to the agent context.
Here's an example using both: dice_game.py ```python import random from pydantic_ai import Agent, RunContext agent = Agent( 'google-gla:gemini-1.5-flash', # (1)! deps_type=str, # (2)! system_prompt=( "You're a dice game, you should roll the die and see if the number " "you get back matches the user's guess. If so, tell them they're a winner. " "Use the player's name in the response." ), ) @agent.tool_plain # (3)! def roll_dice() -> str: """Roll a six-sided die and return the result.""" return str(random.randint(1, 6)) @agent.tool # (4)! def get_player_name(ctx: RunContext[str]) -> str: """Get the player's name.""" return ctx.deps dice_result = agent.run_sync('My guess is 4', deps='Anne') # (5)! print(dice_result.output) #> Congratulations Anne, you guessed correctly! You're a winner! ``` 1. This is a pretty simple task, so we can use the fast and cheap Gemini flash model. 1. We pass the user's name as the dependency, to keep things simple we use just the name as a string as the dependency. 1. This tool doesn't need any context, it just returns a random number. You could probably use dynamic instructions in this case. 1. This tool needs the player's name, so it uses `RunContext` to access dependencies which are just the player's name in this case. 1. Run the agent, passing the player's name as the dependency. *(This example is complete, it can be run "as is")* Let's print the messages from that game to see what happened: dice_game_messages.py ```python from dice_game import dice_result print(dice_result.all_messages()) """ [ ModelRequest( parts=[ SystemPromptPart( content="You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response.", timestamp=datetime.datetime(...), ), UserPromptPart( content='My guess is 4', timestamp=datetime.datetime(...), ), ] ), ModelResponse( parts=[ ToolCallPart( tool_name='roll_dice', args={}, tool_call_id='pyd_ai_tool_call_id' ) ], usage=RequestUsage(input_tokens=90, output_tokens=2), model_name='gemini-1.5-flash', timestamp=datetime.datetime(...), ), ModelRequest( parts=[ ToolReturnPart( tool_name='roll_dice', content='4', tool_call_id='pyd_ai_tool_call_id', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ ToolCallPart( tool_name='get_player_name', args={}, tool_call_id='pyd_ai_tool_call_id' ) ], usage=RequestUsage(input_tokens=91, output_tokens=4), model_name='gemini-1.5-flash', timestamp=datetime.datetime(...), ), ModelRequest( parts=[ ToolReturnPart( tool_name='get_player_name', content='Anne', tool_call_id='pyd_ai_tool_call_id', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ TextPart( content="Congratulations Anne, you guessed correctly! You're a winner!" ) ], usage=RequestUsage(input_tokens=92, output_tokens=12), model_name='gemini-1.5-flash', timestamp=datetime.datetime(...), ), ] """ ``` We can represent this with a diagram: ``` sequenceDiagram participant Agent participant LLM Note over Agent: Send prompts Agent ->> LLM: System: "You're a dice game..."
User: "My guess is 4" activate LLM Note over LLM: LLM decides to use
a tool LLM ->> Agent: Call tool
roll_dice() deactivate LLM activate Agent Note over Agent: Rolls a six-sided die Agent -->> LLM: ToolReturn
"4" deactivate Agent activate LLM Note over LLM: LLM decides to use
another tool LLM ->> Agent: Call tool
get_player_name() deactivate LLM activate Agent Note over Agent: Retrieves player name Agent -->> LLM: ToolReturn
"Anne" deactivate Agent activate LLM Note over LLM: LLM constructs final response LLM ->> Agent: ModelResponse
"Congratulations Anne, ..." deactivate LLM Note over Agent: Game session complete ``` ## Registering via Agent Argument As well as using the decorators, we can register tools via the `tools` argument to the Agent constructor. This is useful when you want to reuse tools, and can also give more fine-grained control over the tools. dice_game_tool_kwarg.py ```python import random from pydantic_ai import Agent, RunContext, Tool system_prompt = """\ You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response. """ def roll_dice() -> str: """Roll a six-sided die and return the result.""" return str(random.randint(1, 6)) def get_player_name(ctx: RunContext[str]) -> str: """Get the player's name.""" return ctx.deps agent_a = Agent( 'google-gla:gemini-1.5-flash', deps_type=str, tools=[roll_dice, get_player_name], # (1)! system_prompt=system_prompt, ) agent_b = Agent( 'google-gla:gemini-1.5-flash', deps_type=str, tools=[ # (2)! Tool(roll_dice, takes_ctx=False), Tool(get_player_name, takes_ctx=True), ], system_prompt=system_prompt, ) dice_result = {} dice_result['a'] = agent_a.run_sync('My guess is 6', deps='Yashar') dice_result['b'] = agent_b.run_sync('My guess is 4', deps='Anne') print(dice_result['a'].output) #> Tough luck, Yashar, you rolled a 4. Better luck next time. print(dice_result['b'].output) #> Congratulations Anne, you guessed correctly! You're a winner! ``` 1. The simplest way to register tools via the `Agent` constructor is to pass a list of functions, the function signature is inspected to determine if the tool takes RunContext. 1. `agent_a` and `agent_b` are identical — but we can use Tool to reuse tool definitions and give more fine-grained control over how tools are defined, e.g. setting their name or description, or using a custom [`prepare`](../tools-advanced/#tool-prepare) method. *(This example is complete, it can be run "as is")* ## Tool Output Tools can return anything that Pydantic can serialize to JSON. For advanced output options including multi-modal content and metadata, see [Advanced Tool Features](../tools-advanced/#function-tool-output). ## Tool Schema Function parameters are extracted from the function signature, and all parameters except `RunContext` are used to build the schema for that tool call. Even better, Pydantic AI extracts the docstring from functions and (thanks to [griffe](https://mkdocstrings.github.io/griffe/)) extracts parameter descriptions from the docstring and adds them to the schema. [Griffe supports](https://mkdocstrings.github.io/griffe/reference/docstrings/#docstrings) extracting parameter descriptions from `google`, `numpy`, and `sphinx` style docstrings. Pydantic AI will infer the format to use based on the docstring, but you can explicitly set it using docstring_format. You can also enforce parameter requirements by setting `require_parameter_descriptions=True`. This will raise a UserError if a parameter description is missing. To demonstrate a tool's schema, here we use FunctionModel to print the schema a model would receive: tool_schema.py ```python from pydantic_ai import Agent from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart from pydantic_ai.models.function import AgentInfo, FunctionModel agent = Agent() @agent.tool_plain(docstring_format='google', require_parameter_descriptions=True) def foobar(a: int, b: str, c: dict[str, list[float]]) -> str: """Get me foobar. 
Args: a: apple pie b: banana cake c: carrot smoothie """ return f'{a} {b} {c}' def print_schema(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse: tool = info.function_tools[0] print(tool.description) #> Get me foobar. print(tool.parameters_json_schema) """ { 'additionalProperties': False, 'properties': { 'a': {'description': 'apple pie', 'type': 'integer'}, 'b': {'description': 'banana cake', 'type': 'string'}, 'c': { 'additionalProperties': {'items': {'type': 'number'}, 'type': 'array'}, 'description': 'carrot smoothie', 'type': 'object', }, }, 'required': ['a', 'b', 'c'], 'type': 'object', } """ return ModelResponse(parts=[TextPart('foobar')]) agent.run_sync('hello', model=FunctionModel(print_schema)) ``` *(This example is complete, it can be run "as is")* If a tool has a single parameter that can be represented as an object in JSON schema (e.g. dataclass, TypedDict, pydantic model), the schema for the tool is simplified to be just that object. Here's an example where we use TestModel.last_model_request_parameters to inspect the tool schema that would be passed to the model. single_parameter_tool.py ```python from pydantic import BaseModel from pydantic_ai import Agent from pydantic_ai.models.test import TestModel agent = Agent() class Foobar(BaseModel): """This is a Foobar""" x: int y: str z: float = 3.14 @agent.tool_plain def foobar(f: Foobar) -> str: return str(f) test_model = TestModel() result = agent.run_sync('hello', model=test_model) print(result.output) #> {"foobar":"x=0 y='a' z=3.14"} print(test_model.last_model_request_parameters.function_tools) """ [ ToolDefinition( name='foobar', parameters_json_schema={ 'properties': { 'x': {'type': 'integer'}, 'y': {'type': 'string'}, 'z': {'default': 3.14, 'type': 'number'}, }, 'required': ['x', 'y'], 'title': 'Foobar', 'type': 'object', }, description='This is a Foobar', ) ] """ ``` *(This example is complete, it can be run "as is")* ## See Also For more tool features and integrations, see: - [Advanced Tool Features](../tools-advanced/) - Custom schemas, dynamic tools, tool execution and retries - [Toolsets](../toolsets/) - Managing collections of tools - [Builtin Tools](../builtin-tools/) - Native tools provided by LLM providers - [Common Tools](../common-tools/) - Ready-to-use tool implementations - [Third-Party Tools](../third-party-tools/) - Integrations with MCP, LangChain, ACI.dev and other tool libraries - [Deferred Tools](../deferred-tools/) - Tools requiring approval or external execution # Models # Anthropic ## Install To use `AnthropicModel` models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `anthropic` optional group: ```bash pip install "pydantic-ai-slim[anthropic]" ``` ```bash uv add "pydantic-ai-slim[anthropic]" ``` ## Configuration To use [Anthropic](https://anthropic.com) through their API, go to [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys) to generate an API key. `AnthropicModelName` contains a list of available Anthropic models. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export ANTHROPIC_API_KEY='your-api-key' ``` You can then use `AnthropicModel` by name: ```python from pydantic_ai import Agent agent = Agent('anthropic:claude-3-5-sonnet-latest') ... 
``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel model = AnthropicModel('claude-3-5-sonnet-latest') agent = Agent(model) ... ``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider model = AnthropicModel( 'claude-3-5-sonnet-latest', provider=AnthropicProvider(api_key='your-api-key') ) agent = Agent(model) ... ``` ## Custom HTTP Client You can customize the `AnthropicProvider` with a custom `httpx.AsyncClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider custom_http_client = AsyncClient(timeout=30) model = AnthropicModel( 'claude-3-5-sonnet-latest', provider=AnthropicProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` # Bedrock ## Install To use `BedrockConverseModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `bedrock` optional group: ```bash pip install "pydantic-ai-slim[bedrock]" ``` ```bash uv add "pydantic-ai-slim[bedrock]" ``` ## Configuration To use [AWS Bedrock](https://aws.amazon.com/bedrock/), you'll need an AWS account with Bedrock enabled and appropriate credentials. You can use either AWS credentials directly or a pre-configured boto3 client. `BedrockModelName` contains a list of available Bedrock models, including models from Anthropic, Amazon, Cohere, Meta, and Mistral. ## Environment variables You can set your AWS credentials as environment variables ([among other options](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#using-environment-variables)): ```bash export AWS_BEARER_TOKEN_BEDROCK='your-api-key' # or: export AWS_ACCESS_KEY_ID='your-access-key' export AWS_SECRET_ACCESS_KEY='your-secret-key' export AWS_DEFAULT_REGION='us-east-1' # or your preferred region ``` You can then use `BedrockConverseModel` by name: ```python from pydantic_ai import Agent agent = Agent('bedrock:anthropic.claude-3-sonnet-20240229-v1:0') ... ``` Or initialize the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel model = BedrockConverseModel('anthropic.claude-3-sonnet-20240229-v1:0') agent = Agent(model) ... ``` ## Customizing Bedrock Runtime API You can customize the Bedrock Runtime API calls by adding additional parameters, such as [guardrail configurations](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) and [performance settings](https://docs.aws.amazon.com/bedrock/latest/userguide/latency-optimized-inference.html). For a complete list of configurable parameters, refer to the documentation for BedrockModelSettings. 
customize_bedrock_model_settings.py ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings # Define Bedrock model settings with guardrail and performance configurations bedrock_model_settings = BedrockModelSettings( bedrock_guardrail_config={ 'guardrailIdentifier': 'v1', 'guardrailVersion': 'v1', 'trace': 'enabled' }, bedrock_performance_configuration={ 'latency': 'optimized' } ) model = BedrockConverseModel(model_name='us.amazon.nova-pro-v1:0') agent = Agent(model=model, model_settings=bedrock_model_settings) ``` ## `provider` argument You can provide a custom `BedrockProvider` via the `provider` argument. This is useful when you want to specify credentials directly or use a custom boto3 client: ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel from pydantic_ai.providers.bedrock import BedrockProvider # Using AWS credentials directly model = BedrockConverseModel( 'anthropic.claude-3-sonnet-20240229-v1:0', provider=BedrockProvider( region_name='us-east-1', aws_access_key_id='your-access-key', aws_secret_access_key='your-secret-key', ), ) agent = Agent(model) ... ``` You can also pass a pre-configured boto3 client: ```python import boto3 from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel from pydantic_ai.providers.bedrock import BedrockProvider # Using a pre-configured boto3 client bedrock_client = boto3.client('bedrock-runtime', region_name='us-east-1') model = BedrockConverseModel( 'anthropic.claude-3-sonnet-20240229-v1:0', provider=BedrockProvider(bedrock_client=bedrock_client), ) agent = Agent(model) ... ``` # Cohere ## Install To use `CohereModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `cohere` optional group: ```bash pip install "pydantic-ai-slim[cohere]" ``` ```bash uv add "pydantic-ai-slim[cohere]" ``` ## Configuration To use [Cohere](https://cohere.com/) through their API, go to [dashboard.cohere.com/api-keys](https://dashboard.cohere.com/api-keys) and follow your nose until you find the place to generate an API key. `CohereModelName` contains a list of the most popular Cohere models. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export CO_API_KEY='your-api-key' ``` You can then use `CohereModel` by name: ```python from pydantic_ai import Agent agent = Agent('cohere:command') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.cohere import CohereModel model = CohereModel('command') agent = Agent(model) ... ``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.cohere import CohereModel from pydantic_ai.providers.cohere import CohereProvider model = CohereModel('command', provider=CohereProvider(api_key='your-api-key')) agent = Agent(model) ... ``` You can also customize the `CohereProvider` with a custom `http_client`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.cohere import CohereModel from pydantic_ai.providers.cohere import CohereProvider custom_http_client = AsyncClient(timeout=30) model = CohereModel( 'command', provider=CohereProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... 
``` # Google The `GoogleModel` is a model that uses the [`google-genai`](https://pypi.org/project/google-genai/) package under the hood to access Google's Gemini models via both the Generative Language API and Vertex AI. ## Install To use `GoogleModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `google` optional group: ```bash pip install "pydantic-ai-slim[google]" ``` ```bash uv add "pydantic-ai-slim[google]" ``` ______________________________________________________________________ ## Configuration `GoogleModel` lets you use Google's Gemini models through their [Generative Language API](https://ai.google.dev/api/all-methods) (`generativelanguage.googleapis.com`) or [Vertex AI API](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models) (`*-aiplatform.googleapis.com`). ### API Key (Generative Language API) To use Gemini via the Generative Language API, go to [aistudio.google.com](https://aistudio.google.com/apikey) and create an API key. Once you have the API key, set it as an environment variable: ```bash export GOOGLE_API_KEY=your-api-key ``` You can then use `GoogleModel` by explicitly creating a provider: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(api_key='your-api-key') model = GoogleModel('gemini-1.5-flash', provider=provider) agent = Agent(model) ... ``` ### Vertex AI (Enterprise/Cloud) If you are an enterprise user, you can use the `google-vertex` provider with `GoogleModel` to access Gemini via Vertex AI. This interface has a number of advantages over the Generative Language API: 1. The VertexAI API comes with more enterprise readiness guarantees. 1. You can [purchase provisioned throughput](https://cloud.google.com/vertex-ai/generative-ai/docs/provisioned-throughput#purchase-provisioned-throughput) with VertexAI to guarantee capacity. 1. If you're running Pydantic AI inside GCP, you don't need to set up authentication, it should "just work". 1. You can decide which region to use, which might be important from a regulatory perspective, and might improve latency. The big disadvantage is that for local development you may need to create and configure a "service account", which can be challenging to get right. Whichever way you authenticate, you'll need to have VertexAI enabled in your GCP account. To use Vertex AI, you may need to set up [application default credentials](https://cloud.google.com/docs/authentication/application-default-credentials) or use a service account. You can also specify the region. #### Application Default Credentials If you have the [`gcloud` CLI](https://cloud.google.com/sdk/gcloud) installed and configured, you can use: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(vertexai=True) model = GoogleModel('gemini-1.5-flash', provider=provider) agent = Agent(model) ... 
``` #### Service Account To use a service account JSON file: google_model_service_account.py ```python from google.oauth2 import service_account from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider credentials = service_account.Credentials.from_service_account_file( 'path/to/service-account.json', scopes=['https://www.googleapis.com/auth/cloud-platform'], ) provider = GoogleProvider(credentials=credentials, project='your-project-id') model = GoogleModel('gemini-1.5-flash', provider=provider) agent = Agent(model) ... ``` #### Customizing Location You can specify the location when using Vertex AI: google_model_location.py ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(vertexai=True, location='asia-east1') model = GoogleModel('gemini-1.5-flash', provider=provider) agent = Agent(model) ... ``` ## Provider Argument You can supply a custom `GoogleProvider` instance using the `provider` argument to configure advanced client options, such as setting a custom `base_url`. This is useful if you're using a custom-compatible endpoint with the Google Generative Language API. ```python from google.genai import Client from google.genai.types import HttpOptions from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider client = Client( api_key='gemini-custom-api-key', http_options=HttpOptions(base_url='gemini-custom-base-url'), ) provider = GoogleProvider(client=client) model = GoogleModel('gemini-1.5-flash', provider=provider) agent = Agent(model) ... ``` ## Model Settings You can customize model behavior using GoogleModelSettings: ```python from google.genai.types import HarmBlockThreshold, HarmCategory from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings settings = GoogleModelSettings( temperature=0.2, max_tokens=1024, google_thinking_config={'thinking_budget': 2048}, google_safety_settings=[ { 'category': HarmCategory.HARM_CATEGORY_HATE_SPEECH, 'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE, } ] ) model = GoogleModel('gemini-1.5-flash') agent = Agent(model, model_settings=settings) ... ``` See the [Gemini API docs](https://ai.google.dev/gemini-api/docs/safety-settings) for more on safety settings, and [thinking config](https://ai.google.dev/gemini-api/docs/thinking). ## Document, Image, Audio, and Video Input `GoogleModel` supports multi-modal input, including documents, images, audio, and video. See the [input documentation](../../input/) for details and examples. ## Model settings You can use the GoogleModelSettings class to customize the model request. ### Disable thinking You can disable thinking by setting the `thinking_budget` to `0` on the `google_thinking_config`: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings model_settings = GoogleModelSettings(google_thinking_config={'thinking_budget': 0}) model = GoogleModel('gemini-2.0-flash') agent = Agent(model, model_settings=model_settings) ... ``` Check out the [Gemini API docs](https://ai.google.dev/gemini-api/docs/thinking) for more on thinking. ### Safety settings You can customize the safety settings by setting the `google_safety_settings` field. 
```python from google.genai.types import HarmBlockThreshold, HarmCategory from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings model_settings = GoogleModelSettings( google_safety_settings=[ { 'category': HarmCategory.HARM_CATEGORY_HATE_SPEECH, 'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE, } ] ) model = GoogleModel('gemini-2.0-flash') agent = Agent(model, model_settings=model_settings) ... ``` See the [Gemini API docs](https://ai.google.dev/gemini-api/docs/safety-settings) for more on safety settings. # Groq ## Install To use `GroqModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `groq` optional group: ```bash pip install "pydantic-ai-slim[groq]" ``` ```bash uv add "pydantic-ai-slim[groq]" ``` ## Configuration To use [Groq](https://groq.com/) through their API, go to [console.groq.com/keys](https://console.groq.com/keys) and follow your nose until you find the place to generate an API key. `GroqModelName` contains a list of available Groq models. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export GROQ_API_KEY='your-api-key' ``` You can then use `GroqModel` by name: ```python from pydantic_ai import Agent agent = Agent('groq:llama-3.3-70b-versatile') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel model = GroqModel('llama-3.3-70b-versatile') agent = Agent(model) ... ``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel from pydantic_ai.providers.groq import GroqProvider model = GroqModel( 'llama-3.3-70b-versatile', provider=GroqProvider(api_key='your-api-key') ) agent = Agent(model) ... ``` You can also customize the `GroqProvider` with a custom `httpx.AsyncHTTPClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel from pydantic_ai.providers.groq import GroqProvider custom_http_client = AsyncClient(timeout=30) model = GroqModel( 'llama-3.3-70b-versatile', provider=GroqProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` # Hugging Face [Hugging Face](https://huggingface.co/) is an AI platform with all major open source models, datasets, MCPs, and demos. You can use [Inference Providers](https://huggingface.co/docs/inference-providers) to run open source models like DeepSeek R1 on scalable serverless infrastructure. ## Install To use `HuggingFaceModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `huggingface` optional group: ```bash pip install "pydantic-ai-slim[huggingface]" ``` ```bash uv add "pydantic-ai-slim[huggingface]" ``` ## Configuration To use [Hugging Face](https://huggingface.co/) inference, you'll need to set up an account which will give you [free tier](https://huggingface.co/docs/inference-providers/pricing) allowance on [Inference Providers](https://huggingface.co/docs/inference-providers). To setup inference, follow these steps: 1. Go to [Hugging Face](https://huggingface.co/join) and sign up for an account. 1. Create a new access token in [Hugging Face](https://huggingface.co/settings/tokens). 1. Set the `HF_TOKEN` environment variable to the token you just created. 
Once you have a Hugging Face access token, you can set it as an environment variable: ```bash export HF_TOKEN='hf_token' ``` ## Usage You can then use HuggingFaceModel by name: ```python from pydantic_ai import Agent agent = Agent('huggingface:Qwen/Qwen3-235B-A22B') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.huggingface import HuggingFaceModel model = HuggingFaceModel('Qwen/Qwen3-235B-A22B') agent = Agent(model) ... ``` By default, the HuggingFaceModel uses the HuggingFaceProvider that will select automatically the first of the inference providers (Cerebras, Together AI, Cohere..etc) available for the model, sorted by your preferred order in https://hf.co/settings/inference-providers. ## Configure the provider If you want to pass parameters in code to the provider, you can programmatically instantiate the HuggingFaceProvider and pass it to the model: ```python from pydantic_ai import Agent from pydantic_ai.models.huggingface import HuggingFaceModel from pydantic_ai.providers.huggingface import HuggingFaceProvider model = HuggingFaceModel('Qwen/Qwen3-235B-A22B', provider=HuggingFaceProvider(api_key='hf_token', provider_name='nebius')) agent = Agent(model) ... ``` ## Custom Hugging Face client HuggingFaceProvider also accepts a custom [`AsyncInferenceClient`](https://huggingface.co/docs/huggingface_hub/v0.29.3/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient) client via the `hf_client` parameter, so you can customise the `headers`, `bill_to` (billing to an HF organization you're a member of), `base_url` etc. as defined in the [Hugging Face Hub python library docs](https://huggingface.co/docs/huggingface_hub/package_reference/inference_client). ```python from huggingface_hub import AsyncInferenceClient from pydantic_ai import Agent from pydantic_ai.models.huggingface import HuggingFaceModel from pydantic_ai.providers.huggingface import HuggingFaceProvider client = AsyncInferenceClient( bill_to='openai', api_key='hf_token', provider='fireworks-ai', ) model = HuggingFaceModel( 'Qwen/Qwen3-235B-A22B', provider=HuggingFaceProvider(hf_client=client), ) agent = Agent(model) ... ``` # Mistral ## Install To use `MistralModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `mistral` optional group: ```bash pip install "pydantic-ai-slim[mistral]" ``` ```bash uv add "pydantic-ai-slim[mistral]" ``` ## Configuration To use [Mistral](https://mistral.ai) through their API, go to [console.mistral.ai/api-keys/](https://console.mistral.ai/api-keys/) and follow your nose until you find the place to generate an API key. `LatestMistralModelNames` contains a list of the most popular Mistral models. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export MISTRAL_API_KEY='your-api-key' ``` You can then use `MistralModel` by name: ```python from pydantic_ai import Agent agent = Agent('mistral:mistral-large-latest') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.mistral import MistralModel model = MistralModel('mistral-small-latest') agent = Agent(model) ... 
``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.mistral import MistralModel from pydantic_ai.providers.mistral import MistralProvider model = MistralModel( 'mistral-large-latest', provider=MistralProvider(api_key='your-api-key', base_url='https://') ) agent = Agent(model) ... ``` You can also customize the provider with a custom `httpx.AsyncHTTPClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.mistral import MistralModel from pydantic_ai.providers.mistral import MistralProvider custom_http_client = AsyncClient(timeout=30) model = MistralModel( 'mistral-large-latest', provider=MistralProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` # OpenAI ## Install To use OpenAI models or OpenAI-compatible APIs, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `openai` optional group: ```bash pip install "pydantic-ai-slim[openai]" ``` ```bash uv add "pydantic-ai-slim[openai]" ``` ## Configuration To use `OpenAIChatModel` with the OpenAI API, go to [platform.openai.com](https://platform.openai.com/) and follow your nose until you find the place to generate an API key. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export OPENAI_API_KEY='your-api-key' ``` You can then use `OpenAIChatModel` by name: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel model = OpenAIChatModel('gpt-4o') agent = Agent(model) ... ``` By default, the `OpenAIChatModel` uses the `OpenAIProvider` with the `base_url` set to `https://api.openai.com/v1`. ## Configure the provider If you want to pass parameters in code to the provider, you can programmatically instantiate the OpenAIProvider and pass it to the model: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider model = OpenAIChatModel('gpt-4o', provider=OpenAIProvider(api_key='your-api-key')) agent = Agent(model) ... ``` ## Custom OpenAI Client `OpenAIProvider` also accepts a custom `AsyncOpenAI` client via the `openai_client` parameter, so you can customise the `organization`, `project`, `base_url` etc. as defined in the [OpenAI API docs](https://platform.openai.com/docs/api-reference). custom_openai_client.py ```python from openai import AsyncOpenAI from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider client = AsyncOpenAI(max_retries=3) model = OpenAIChatModel('gpt-4o', provider=OpenAIProvider(openai_client=client)) agent = Agent(model) ... ``` You could also use the [`AsyncAzureOpenAI`](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/switching-endpoints) client to use the Azure OpenAI API. Note that the `AsyncAzureOpenAI` is a subclass of `AsyncOpenAI`. 
```python
from openai import AsyncAzureOpenAI

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

client = AsyncAzureOpenAI(
    azure_endpoint='...',
    api_version='2024-07-01-preview',
    api_key='your-api-key',
)

model = OpenAIChatModel(
    'gpt-4o',
    provider=OpenAIProvider(openai_client=client),
)
agent = Agent(model)
...
```

## OpenAI Responses API

Pydantic AI also supports OpenAI's [Responses API](https://platform.openai.com/docs/api-reference/responses) through the OpenAIResponsesModel class.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel

model = OpenAIResponsesModel('gpt-4o')
agent = Agent(model)
...
```

The Responses API has built-in tools that you can use instead of building your own:

- [Web search](https://platform.openai.com/docs/guides/tools-web-search): allow models to search the web for the latest information before generating a response.
- [File search](https://platform.openai.com/docs/guides/tools-file-search): allow models to search your files for relevant information before generating a response.
- [Computer use](https://platform.openai.com/docs/guides/tools-computer-use): allow models to use a computer to perform tasks on your behalf.

You can use the `OpenAIResponsesModelSettings` class to make use of those built-in tools:

```python
from openai.types.responses import WebSearchToolParam  # (1)!

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model_settings = OpenAIResponsesModelSettings(
    openai_builtin_tools=[WebSearchToolParam(type='web_search_preview')],
)
model = OpenAIResponsesModel('gpt-4o')
agent = Agent(model=model, model_settings=model_settings)

result = agent.run_sync('What is the weather in Tokyo?')
print(result.output)
"""
As of 7:48 AM on Wednesday, April 2, 2025, in Tokyo, Japan, the weather is cloudy with a temperature of 53°F (12°C).
"""
```

1. The file search tool and computer use tool can also be imported from `openai.types.responses`.

You can learn more about the differences between the Responses API and Chat Completions API in the [OpenAI API docs](https://platform.openai.com/docs/guides/responses-vs-chat-completions).

#### Referencing earlier responses

The Responses API supports referencing earlier model responses in a new request using a `previous_response_id` parameter, to ensure the full [conversation state](https://platform.openai.com/docs/guides/conversation-state?api-mode=responses#passing-context-from-the-previous-response) including [reasoning items](https://platform.openai.com/docs/guides/reasoning#keeping-reasoning-items-in-context) is kept in context. This is available through the `openai_previous_response_id` field in OpenAIResponsesModelSettings.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model=model)

result = agent.run_sync('The secret is 1234')
model_settings = OpenAIResponsesModelSettings(
    openai_previous_response_id=result.all_messages()[-1].provider_response_id
)

result = agent.run_sync('What is the secret code?', model_settings=model_settings)
print(result.output)
#> 1234
```

By passing the `provider_response_id` from an earlier run, you can allow the model to build on its own prior reasoning without needing to resend the full message history.
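If you chain more than two runs, the same bookkeeping applies at every step: read the `provider_response_id` off the last message of the previous run and pass it along in the settings for the next one. Here's a minimal sketch of that pattern (the prompts and the three-turn loop are purely illustrative, not part of the example above):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

agent = Agent(OpenAIResponsesModel('gpt-5'))

previous_response_id: str | None = None
for prompt in ['The secret is 1234', 'Add 1 to the secret.', 'What is the new secret?']:
    settings = (
        OpenAIResponsesModelSettings(openai_previous_response_id=previous_response_id)
        if previous_response_id
        else None
    )
    result = agent.run_sync(prompt, model_settings=settings)
    # Each run's last message carries the ID we chain the next request onto.
    previous_response_id = result.all_messages()[-1].provider_response_id
    print(result.output)
```

The `'auto'` mode described next does this selection for you.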
##### Automatically referencing earlier responses

When the `openai_previous_response_id` field is set to `'auto'`, Pydantic AI will automatically select the most recent `provider_response_id` from message history and omit messages that came before it, letting the OpenAI API leverage server-side history instead for improved efficiency.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model=model)

result1 = agent.run_sync('Tell me a joke.')
print(result1.output)
#> Did you hear about the toothpaste scandal? They called it Colgate.

# When set to 'auto', the most recent provider_response_id
# and the messages after it are sent in the request.
model_settings = OpenAIResponsesModelSettings(openai_previous_response_id='auto')
result2 = agent.run_sync(
    'Explain?', message_history=result1.new_messages(), model_settings=model_settings
)
print(result2.output)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.
```

## OpenAI-compatible Models

Many providers and models are compatible with the OpenAI API, and can be used with `OpenAIChatModel` in Pydantic AI. Before getting started, check the [installation and configuration](#install) instructions above.

To use another OpenAI-compatible API, you can make use of the `base_url` and `api_key` arguments from `OpenAIProvider`:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

model = OpenAIChatModel(
    'model_name',
    provider=OpenAIProvider(
        base_url='https://<openai-compatible-api-endpoint>.com', api_key='your-api-key'
    ),
)
agent = Agent(model)
...
```

Various providers also have their own provider classes, so that you don't need to specify the base URL yourself and can use the standard `<PROVIDER>_API_KEY` environment variable to set the API key.

When a provider has its own provider class, you can use the `Agent("<provider>:<model>")` shorthand, e.g. `Agent("deepseek:deepseek-chat")` or `Agent("openrouter:google/gemini-2.5-pro-preview")`, instead of building the `OpenAIChatModel` explicitly. Similarly, you can pass the provider name as a string to the `provider` argument on `OpenAIChatModel` instead of instantiating the provider class explicitly.

#### Model Profile

Sometimes, the provider or model you're using will have slightly different requirements than OpenAI's API or models, like having different restrictions on JSON schemas for tool definitions, or not supporting tool definitions to be marked as strict.

When using an alternative provider class provided by Pydantic AI, an appropriate model profile is typically selected automatically based on the model name.
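For example, in the sketch below (which reuses the OpenRouter setup shown later on this page; the API key is a placeholder), a Gemini-appropriate profile is picked purely from the model name, even though the request is made through an OpenAI-compatible API:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openrouter import OpenRouterProvider

# The provider class selects a profile from the model name, so e.g. tool JSON
# schemas are prepared the way Gemini expects, as they would be with GoogleModel.
model = OpenAIChatModel(
    'google/gemini-2.5-pro-preview',
    provider=OpenRouterProvider(api_key='your-openrouter-api-key'),
)
agent = Agent(model)
...
```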
If the model you're using is not working correctly out of the box, you can tweak various aspects of how model requests are constructed by providing your own ModelProfile (for behaviors shared among all model classes) or OpenAIModelProfile (for behaviors specific to `OpenAIChatModel`):

```py
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.profiles import InlineDefsJsonSchemaTransformer
from pydantic_ai.profiles.openai import OpenAIModelProfile
from pydantic_ai.providers.openai import OpenAIProvider

model = OpenAIChatModel(
    'model_name',
    provider=OpenAIProvider(
        base_url='https://<openai-compatible-api-endpoint>.com', api_key='your-api-key'
    ),
    profile=OpenAIModelProfile(
        json_schema_transformer=InlineDefsJsonSchemaTransformer,  # Supported by any model class on a plain ModelProfile
        openai_supports_strict_tool_definition=False  # Supported by OpenAIChatModel only, requires OpenAIModelProfile
    )
)
agent = Agent(model)
```

### DeepSeek

To use the [DeepSeek](https://deepseek.com) provider, first create an API key by following the [Quick Start guide](https://api-docs.deepseek.com/). Once you have the API key, you can use it with the DeepSeekProvider:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.deepseek import DeepSeekProvider

model = OpenAIChatModel(
    'deepseek-chat',
    provider=DeepSeekProvider(api_key='your-deepseek-api-key'),
)
agent = Agent(model)
...
```

You can also customize any provider with a custom `http_client`:

```python
from httpx import AsyncClient

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.deepseek import DeepSeekProvider

custom_http_client = AsyncClient(timeout=30)
model = OpenAIChatModel(
    'deepseek-chat',
    provider=DeepSeekProvider(
        api_key='your-deepseek-api-key', http_client=custom_http_client
    ),
)
agent = Agent(model)
...
```

### Ollama

To use [Ollama](https://ollama.com/), you must first download the Ollama client, and then download a model using the [Ollama model library](https://ollama.com/library).

You must also ensure the Ollama server is running when trying to make requests to it. For more information, please see the [Ollama documentation](https://github.com/ollama/ollama/tree/main/docs).

You can then use the model with the OllamaProvider.

#### Example local usage

With `ollama` installed, you can run the server with the model you want to use:

```bash
ollama run llama3.2
```

(this will pull the `llama3.2` model if you don't already have it downloaded)

Then run your code; here's a minimal example:

```python
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.ollama import OllamaProvider

class CityLocation(BaseModel):
    city: str
    country: str

ollama_model = OpenAIChatModel(
    model_name='llama3.2',
    provider=OllamaProvider(base_url='http://localhost:11434/v1'),
)
agent = Agent(ollama_model, output_type=CityLocation)

result = agent.run_sync('Where were the olympics held in 2012?')
print(result.output)
#> city='London' country='United Kingdom'
print(result.usage())
#> RunUsage(input_tokens=57, output_tokens=8, requests=1)
```

#### Example using a remote server

```python
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.ollama import OllamaProvider

ollama_model = OpenAIChatModel(
    model_name='qwen2.5-coder:7b',  # (1)!
provider=OllamaProvider(base_url='http://192.168.1.74:11434/v1'), # (2)! ) class CityLocation(BaseModel): city: str country: str agent = Agent(model=ollama_model, output_type=CityLocation) result = agent.run_sync('Where were the olympics held in 2012?') print(result.output) #> city='London' country='United Kingdom' print(result.usage()) #> RunUsage(input_tokens=57, output_tokens=8, requests=1) ``` 1. The name of the model running on the remote server 1. The url of the remote server ### Azure AI Foundry If you want to use [Azure AI Foundry](https://ai.azure.com/) as your provider, you can do so by using the AzureProvider class. ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.azure import AzureProvider model = OpenAIChatModel( 'gpt-4o', provider=AzureProvider( azure_endpoint='your-azure-endpoint', api_version='your-api-version', api_key='your-api-key', ), ) agent = Agent(model) ... ``` ### OpenRouter To use [OpenRouter](https://openrouter.ai), first create an API key at [openrouter.ai/keys](https://openrouter.ai/keys). Once you have the API key, you can use it with the OpenRouterProvider: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openrouter import OpenRouterProvider model = OpenAIChatModel( 'anthropic/claude-3.5-sonnet', provider=OpenRouterProvider(api_key='your-openrouter-api-key'), ) agent = Agent(model) ... ``` ### Vercel AI Gateway To use [Vercel's AI Gateway](https://vercel.com/docs/ai-gateway), first follow the [documentation](https://vercel.com/docs/ai-gateway) instructions on obtaining an API key or OIDC token. You can set your credentials using one of these environment variables: ```bash export VERCEL_AI_GATEWAY_API_KEY='your-ai-gateway-api-key' # OR export VERCEL_OIDC_TOKEN='your-oidc-token' ``` Once you have set the environment variable, you can use it with the VercelProvider: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.vercel import VercelProvider # Uses environment variable automatically model = OpenAIChatModel( 'anthropic/claude-4-sonnet', provider=VercelProvider(), ) agent = Agent(model) # Or pass the API key directly model = OpenAIChatModel( 'anthropic/claude-4-sonnet', provider=VercelProvider(api_key='your-vercel-ai-gateway-api-key'), ) agent = Agent(model) ... ``` ### Grok (xAI) Go to [xAI API Console](https://console.x.ai/) and create an API key. Once you have the API key, you can use it with the GrokProvider: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.grok import GrokProvider model = OpenAIChatModel( 'grok-2-1212', provider=GrokProvider(api_key='your-xai-api-key'), ) agent = Agent(model) ... ``` ### MoonshotAI Create an API key in the [Moonshot Console](https://platform.moonshot.ai/console). With that key you can instantiate the MoonshotAIProvider: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.moonshotai import MoonshotAIProvider model = OpenAIChatModel( 'kimi-k2-0711-preview', provider=MoonshotAIProvider(api_key='your-moonshot-api-key'), ) agent = Agent(model) ... ``` ### GitHub Models To use [GitHub Models](https://docs.github.com/en/github-models), you'll need a GitHub personal access token with the `models: read` permission. 
Once you have the token, you can use it with the GitHubProvider: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.github import GitHubProvider model = OpenAIChatModel( 'xai/grok-3-mini', # GitHub Models uses prefixed model names provider=GitHubProvider(api_key='your-github-token'), ) agent = Agent(model) ... ``` You can also set the `GITHUB_API_KEY` environment variable: ```bash export GITHUB_API_KEY='your-github-token' ``` GitHub Models supports various model families with different prefixes. You can see the full list on the [GitHub Marketplace](https://github.com/marketplace?type=models) or the public [catalog endpoint](https://models.github.ai/catalog/models). ### Perplexity Follow the Perplexity [getting started](https://docs.perplexity.ai/guides/getting-started) guide to create an API key. Then, you can query the Perplexity API with the following: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider model = OpenAIChatModel( 'sonar-pro', provider=OpenAIProvider( base_url='https://api.perplexity.ai', api_key='your-perplexity-api-key', ), ) agent = Agent(model) ... ``` ### Fireworks AI Go to [Fireworks.AI](https://fireworks.ai/) and create an API key in your account settings. Once you have the API key, you can use it with the FireworksProvider: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.fireworks import FireworksProvider model = OpenAIChatModel( 'accounts/fireworks/models/qwq-32b', # model library available at https://fireworks.ai/models provider=FireworksProvider(api_key='your-fireworks-api-key'), ) agent = Agent(model) ... ``` ### Together AI Go to [Together.ai](https://www.together.ai/) and create an API key in your account settings. Once you have the API key, you can use it with the TogetherProvider: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.together import TogetherProvider model = OpenAIChatModel( 'meta-llama/Llama-3.3-70B-Instruct-Turbo-Free', # model library available at https://www.together.ai/models provider=TogetherProvider(api_key='your-together-api-key'), ) agent = Agent(model) ... ``` ### Heroku AI To use [Heroku AI](https://www.heroku.com/ai), you can use the HerokuProvider: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.heroku import HerokuProvider model = OpenAIChatModel( 'claude-3-7-sonnet', provider=HerokuProvider(api_key='your-heroku-inference-key'), ) agent = Agent(model) ... ``` You can set the `HEROKU_INFERENCE_KEY` and `HEROKU_INFERENCE_URL` environment variables to set the API key and base URL, respectively: ```bash export HEROKU_INFERENCE_KEY='your-heroku-inference-key' export HEROKU_INFERENCE_URL='https://us.inference.heroku.com' ``` ### Cerebras To use [Cerebras](https://cerebras.ai/), you need to create an API key in the [Cerebras Console](https://cloud.cerebras.ai/). Once you've set the `CEREBRAS_API_KEY` environment variable, you can run the following: ```python from pydantic_ai import Agent agent = Agent('cerebras:llama3.3-70b') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. 
```

If you need to configure the provider, you can use the CerebrasProvider class:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.cerebras import CerebrasProvider

model = OpenAIChatModel(
    'llama3.3-70b',
    provider=CerebrasProvider(api_key='your-cerebras-api-key'),
)
agent = Agent(model)

result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

### LiteLLM

To use [LiteLLM](https://www.litellm.ai/), set the configs as outlined in the [doc](https://docs.litellm.ai/docs/set_keys). In `LiteLLMProvider`, you can pass `api_base` and `api_key`. The value of these configs will depend on your setup. For example, if you are using OpenAI models, you need to pass `https://api.openai.com/v1` as the `api_base` and your OpenAI API key as the `api_key`. If you are using a LiteLLM proxy server running on your local machine, you need to pass `http://localhost:<port>` as the `api_base` and your LiteLLM API key (or a placeholder) as the `api_key`.

To use custom LLMs, use the `custom/` prefix in the model name.

Once you have the configs, use the LiteLLMProvider as follows:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.litellm import LiteLLMProvider

model = OpenAIChatModel(
    'openai/gpt-3.5-turbo',
    provider=LiteLLMProvider(
        api_base='<api-base-url>', api_key='<api-key>'
    )
)
agent = Agent(model)

result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
...
```

# Model Providers

Pydantic AI is model-agnostic and has built-in support for multiple model providers:

- [OpenAI](../openai/)
- [Anthropic](../anthropic/)
- [Gemini](../google/) (via two different APIs: Generative Language API and VertexAI API)
- [Groq](../groq/)
- [Mistral](../mistral/)
- [Cohere](../cohere/)
- [Bedrock](../bedrock/)
- [Hugging Face](../huggingface/)

## OpenAI-compatible Providers

In addition, many providers are compatible with the OpenAI API, and can be used with `OpenAIChatModel` in Pydantic AI:

- [DeepSeek](../openai/#deepseek)
- [Grok (xAI)](../openai/#grok-xai)
- [Ollama](../openai/#ollama)
- [OpenRouter](../openai/#openrouter)
- [Vercel AI Gateway](../openai/#vercel-ai-gateway)
- [Perplexity](../openai/#perplexity)
- [Fireworks AI](../openai/#fireworks-ai)
- [Together AI](../openai/#together-ai)
- [Azure AI Foundry](../openai/#azure-ai-foundry)
- [Heroku](../openai/#heroku-ai)
- [GitHub Models](../openai/#github-models)
- [Cerebras](../openai/#cerebras)
- [LiteLLM](../openai/#litellm)

Pydantic AI also comes with [`TestModel`](../../api/models/test/) and [`FunctionModel`](../../api/models/function/) for testing and development.

To use each model provider, you need to configure your local environment and make sure you have the right packages installed. If you try to use the model without having done so, you'll be told what to install.

## Models and Providers

Pydantic AI uses a few key terms to describe how it interacts with different LLMs:

- **Model**: This refers to the Pydantic AI class used to make requests following a specific LLM API (generally by wrapping a vendor-provided SDK, like the `openai` python SDK). These classes implement a vendor-SDK-agnostic API, ensuring a single Pydantic AI agent is portable to different LLM vendors just by swapping out the Model it uses, without any other code changes.
  Model classes are named roughly in the format `<VendorSdk>Model`; for example, we have `OpenAIChatModel`, `AnthropicModel`, `GoogleModel`, etc. When using a Model class, you specify the actual LLM model name (e.g., `gpt-4o`, `claude-3-5-sonnet-latest`, `gemini-1.5-flash`) as a parameter.

- **Provider**: This refers to provider-specific classes which handle the authentication and connections to an LLM vendor. Passing a non-default *Provider* as a parameter to a Model is how you can ensure that your agent will make requests to a specific endpoint, or make use of a specific approach to authentication (e.g., you can use Azure auth with the `OpenAIChatModel` by way of the `AzureProvider`). In particular, this is how you can make use of an AI gateway, or an LLM vendor that offers API compatibility with the vendor SDK used by an existing Model (such as `OpenAIChatModel`).

- **Profile**: This refers to a description of how requests to a specific model or family of models need to be constructed to get the best results, independent of the model and provider classes used. For example, different models have different restrictions on the JSON schemas that can be used for tools, and the same schema transformer needs to be used for Gemini models whether you're using `GoogleModel` with model name `gemini-2.5-pro-preview`, or `OpenAIChatModel` with `OpenRouterProvider` and model name `google/gemini-2.5-pro-preview`.

When you instantiate an Agent with just a name formatted as `<provider>:<model>`, e.g. `openai:gpt-4o` or `openrouter:google/gemini-2.5-pro-preview`, Pydantic AI will automatically select the appropriate model class, provider, and profile. If you want to use a different provider or profile, you can instantiate a model class directly and pass in `provider` and/or `profile` arguments.

## Custom Models

Note

If a model API is compatible with the OpenAI API, you do not need a custom model class and can provide your own [custom provider](../openai/#openai-compatible-models) instead.

To implement support for a model API that's not already supported, you will need to subclass the Model abstract base class. For streaming, you'll also need to implement the StreamedResponse abstract base class.

The best place to start is to review the source code for existing implementations, e.g. [`OpenAIChatModel`](https://github.com/pydantic/pydantic-ai/blob/main/pydantic_ai_slim/pydantic_ai/models/openai.py).

For details on when we'll accept contributions adding new models to Pydantic AI, see the [contributing guidelines](../../contributing/#new-model-rules).

## Fallback Model

You can use FallbackModel to attempt multiple models in sequence until one successfully returns a result. Under the hood, Pydantic AI automatically switches from one model to the next if the current model returns a 4xx or 5xx status code.

In the following example, the agent first makes a request to the OpenAI model (which fails due to an invalid API key), and then falls back to the Anthropic model.
fallback_model.py ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.models.fallback import FallbackModel from pydantic_ai.models.openai import OpenAIChatModel openai_model = OpenAIChatModel('gpt-4o') anthropic_model = AnthropicModel('claude-3-5-sonnet-latest') fallback_model = FallbackModel(openai_model, anthropic_model) agent = Agent(fallback_model) response = agent.run_sync('What is the capital of France?') print(response.data) #> Paris print(response.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), part_kind='user-prompt', ) ], kind='request', ), ModelResponse( parts=[TextPart(content='Paris', part_kind='text')], model_name='claude-3-5-sonnet-latest', timestamp=datetime.datetime(...), kind='response', provider_response_id=None, ), ] """ ``` The `ModelResponse` message above indicates in the `model_name` field that the output was returned by the Anthropic model, which is the second model specified in the `FallbackModel`. Note Each model's options should be configured individually. For example, `base_url`, `api_key`, and custom clients should be set on each model itself, not on the `FallbackModel`. ### Per-Model Settings You can configure different ModelSettings for each model in a fallback chain by passing the `settings` parameter when creating each model. This is particularly useful when different providers have different optimal configurations: fallback_model_per_settings.py ```python from pydantic_ai import Agent, ModelSettings from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.models.fallback import FallbackModel from pydantic_ai.models.openai import OpenAIChatModel # Configure each model with provider-specific optimal settings openai_model = OpenAIChatModel( 'gpt-4o', settings=ModelSettings(temperature=0.7, max_tokens=1000) # Higher creativity for OpenAI ) anthropic_model = AnthropicModel( 'claude-3-5-sonnet-latest', settings=ModelSettings(temperature=0.2, max_tokens=1000) # Lower temperature for consistency ) fallback_model = FallbackModel(openai_model, anthropic_model) agent = Agent(fallback_model) result = agent.run_sync('Write a creative story about space exploration') print(result.output) """ In the year 2157, Captain Maya Chen piloted her spacecraft through the vast expanse of the Andromeda Galaxy. As she discovered a planet with crystalline mountains that sang in harmony with the cosmic winds, she realized that space exploration was not just about finding new worlds, but about finding new ways to understand the universe and our place within it. """ ``` In this example, if the OpenAI model fails, the agent will automatically fall back to the Anthropic model with its own configured settings. The `FallbackModel` itself doesn't have settings - it uses the individual settings of whichever model successfully handles the request. In this next example, we demonstrate the exception-handling capabilities of `FallbackModel`. If all models fail, a FallbackExceptionGroup is raised, which contains all the exceptions encountered during the `run` execution. 
fallback_model_failure.py ```python from pydantic_ai import Agent, ModelHTTPError from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.models.fallback import FallbackModel from pydantic_ai.models.openai import OpenAIChatModel openai_model = OpenAIChatModel('gpt-4o') anthropic_model = AnthropicModel('claude-3-5-sonnet-latest') fallback_model = FallbackModel(openai_model, anthropic_model) agent = Agent(fallback_model) try: response = agent.run_sync('What is the capital of France?') except* ModelHTTPError as exc_group: for exc in exc_group.exceptions: print(exc) ``` Since [`except*`](https://docs.python.org/3/reference/compound_stmts.html#except-star) is only supported in Python 3.11+, we use the [`exceptiongroup`](https://github.com/agronholm/exceptiongroup) backport package for earlier Python versions: fallback_model_failure.py ```python from exceptiongroup import catch from pydantic_ai import Agent, ModelHTTPError from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.models.fallback import FallbackModel from pydantic_ai.models.openai import OpenAIChatModel def model_status_error_handler(exc_group: BaseExceptionGroup) -> None: for exc in exc_group.exceptions: print(exc) openai_model = OpenAIChatModel('gpt-4o') anthropic_model = AnthropicModel('claude-3-5-sonnet-latest') fallback_model = FallbackModel(openai_model, anthropic_model) agent = Agent(fallback_model) with catch({ModelHTTPError: model_status_error_handler}): response = agent.run_sync('What is the capital of France?') ``` By default, the `FallbackModel` only moves on to the next model if the current model raises a ModelHTTPError. You can customize this behavior by passing a custom `fallback_on` argument to the `FallbackModel` constructor. # Graphs # Graphs Don't use a nail gun unless you need a nail gun If Pydantic AI [agents](../agents/) are a hammer, and [multi-agent workflows](../multi-agent-applications/) are a sledgehammer, then graphs are a nail gun: - sure, nail guns look cooler than hammers - but nail guns take a lot more setup than hammers - and nail guns don't make you a better builder, they make you a builder with a nail gun - Lastly, (and at the risk of torturing this metaphor), if you're a fan of medieval tools like mallets and untyped Python, you probably won't like nail guns or our approach to graphs. (But then again, if you're not a fan of type hints in Python, you've probably already bounced off Pydantic AI to use one of the toy agent frameworks — good luck, and feel free to borrow my sledgehammer when you realize you need it) In short, graphs are a powerful tool, but they're not the right tool for every job. Please consider other [multi-agent approaches](../multi-agent-applications/) before proceeding. If you're not confident a graph-based approach is a good idea, it might be unnecessary. Graphs and finite state machines (FSMs) are a powerful abstraction to model, execute, control and visualize complex workflows. Alongside Pydantic AI, we've developed `pydantic-graph` — an async graph and state machine library for Python where nodes and edges are defined using type hints. While this library is developed as part of Pydantic AI; it has no dependency on `pydantic-ai` and can be considered as a pure graph-based state machine library. You may find it useful whether or not you're using Pydantic AI or even building with GenAI. `pydantic-graph` is designed for advanced users and makes heavy use of Python generics and type hints. 
It is not designed to be as beginner-friendly as Pydantic AI. ## Installation `pydantic-graph` is a required dependency of `pydantic-ai`, and an optional dependency of `pydantic-ai-slim`, see [installation instructions](../install/#slim-install) for more information. You can also install it directly: ```bash pip install pydantic-graph ``` ```bash uv add pydantic-graph ``` ## Graph Types `pydantic-graph` is made up of a few key components: ### GraphRunContext GraphRunContext — The context for the graph run, similar to Pydantic AI's RunContext. This holds the state of the graph and dependencies and is passed to nodes when they're run. `GraphRunContext` is generic in the state type of the graph it's used in, StateT. ### End End — return value to indicate the graph run should end. `End` is generic in the graph return type of the graph it's used in, RunEndT. ### Nodes Subclasses of BaseNode define nodes for execution in the graph. Nodes, which are generally dataclasses, generally consist of: - fields containing any parameters required/optional when calling the node - the business logic to execute the node, in the run method - return annotations of the run method, which are read by `pydantic-graph` to determine the outgoing edges of the node Nodes are generic in: - **state**, which must have the same type as the state of graphs they're included in, StateT has a default of `None`, so if you're not using state you can omit this generic parameter, see [stateful graphs](#stateful-graphs) for more information - **deps**, which must have the same type as the deps of the graph they're included in, DepsT has a default of `None`, so if you're not using deps you can omit this generic parameter, see [dependency injection](#dependency-injection) for more information - **graph return type** — this only applies if the node returns End. RunEndT has a default of Never so this generic parameter can be omitted if the node doesn't return `End`, but must be included if it does. Here's an example of a start or intermediate node in a graph — it can't end the run as it doesn't return End: intermediate_node.py ```py from dataclasses import dataclass from pydantic_graph import BaseNode, GraphRunContext @dataclass class MyNode(BaseNode[MyState]): # (1)! foo: int # (2)! async def run( self, ctx: GraphRunContext[MyState], # (3)! ) -> AnotherNode: # (4)! ... return AnotherNode() ``` 1. State in this example is `MyState` (not shown), hence `BaseNode` is parameterized with `MyState`. This node can't end the run, so the `RunEndT` generic parameter is omitted and defaults to `Never`. 1. `MyNode` is a dataclass and has a single field `foo`, an `int`. 1. The `run` method takes a `GraphRunContext` parameter, again parameterized with state `MyState`. 1. The return type of the `run` method is `AnotherNode` (not shown), this is used to determine the outgoing edges of the node. We could extend `MyNode` to optionally end the run if `foo` is divisible by 5: intermediate_or_end_node.py ```py from dataclasses import dataclass from pydantic_graph import BaseNode, End, GraphRunContext @dataclass class MyNode(BaseNode[MyState, None, int]): # (1)! foo: int async def run( self, ctx: GraphRunContext[MyState], ) -> AnotherNode | End[int]: # (2)! if self.foo % 5 == 0: return End(self.foo) else: return AnotherNode() ``` 1. We parameterize the node with the return type (`int` in this case) as well as state. Because generic parameters are positional-only, we have to include `None` as the second parameter representing deps. 1. 
The return type of the `run` method is now a union of `AnotherNode` and `End[int]`, this allows the node to end the run if `foo` is divisible by 5. ### Graph Graph — this is the execution graph itself, made up of a set of [node classes](#nodes) (i.e., `BaseNode` subclasses). `Graph` is generic in: - **state** the state type of the graph, StateT - **deps** the deps type of the graph, DepsT - **graph return type** the return type of the graph run, RunEndT Here's an example of a simple graph: graph_example.py ```py from __future__ import annotations from dataclasses import dataclass from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class DivisibleBy5(BaseNode[None, None, int]): # (1)! foo: int async def run( self, ctx: GraphRunContext, ) -> Increment | End[int]: if self.foo % 5 == 0: return End(self.foo) else: return Increment(self.foo) @dataclass class Increment(BaseNode): # (2)! foo: int async def run(self, ctx: GraphRunContext) -> DivisibleBy5: return DivisibleBy5(self.foo + 1) fives_graph = Graph(nodes=[DivisibleBy5, Increment]) # (3)! result = fives_graph.run_sync(DivisibleBy5(4)) # (4)! print(result.output) #> 5 ``` 1. The `DivisibleBy5` node is parameterized with `None` for the state param and `None` for the deps param as this graph doesn't use state or deps, and `int` as it can end the run. 1. The `Increment` node doesn't return `End`, so the `RunEndT` generic parameter is omitted, state can also be omitted as the graph doesn't use state. 1. The graph is created with a sequence of nodes. 1. The graph is run synchronously with run_sync. The initial node is `DivisibleBy5(4)`. Because the graph doesn't use external state or deps, we don't pass `state` or `deps`. *(This example is complete, it can be run "as is")* A [mermaid diagram](#mermaid-diagrams) for this graph can be generated with the following code: graph_example_diagram.py ```py from graph_example import DivisibleBy5, fives_graph fives_graph.mermaid_code(start_node=DivisibleBy5) ``` ``` --- title: fives_graph --- stateDiagram-v2 [*] --> DivisibleBy5 DivisibleBy5 --> Increment DivisibleBy5 --> [*] Increment --> DivisibleBy5 ``` In order to visualize a graph within a `jupyter-notebook`, `IPython.display` needs to be used: jupyter_display_mermaid.py ```python from graph_example import DivisibleBy5, fives_graph from IPython.display import Image, display display(Image(fives_graph.mermaid_image(start_node=DivisibleBy5))) ``` ## Stateful Graphs The "state" concept in `pydantic-graph` provides an optional way to access and mutate an object (often a `dataclass` or Pydantic model) as nodes run in a graph. If you think of Graphs as a production line, then your state is the engine being passed along the line and built up by each node as the graph is run. `pydantic-graph` provides state persistence, with the state recorded after each node is run. (See [State Persistence](#state-persistence).) Here's an example of a graph which represents a vending machine where the user may insert coins and select a product to purchase. vending_machine.py ```python from __future__ import annotations from dataclasses import dataclass from rich.prompt import Prompt from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class MachineState: # (1)! user_balance: float = 0.0 product: str | None = None @dataclass class InsertCoin(BaseNode[MachineState]): # (3)! async def run(self, ctx: GraphRunContext[MachineState]) -> CoinsInserted: # (16)! return CoinsInserted(float(Prompt.ask('Insert coins'))) # (4)! 
@dataclass class CoinsInserted(BaseNode[MachineState]): amount: float # (5)! async def run( self, ctx: GraphRunContext[MachineState] ) -> SelectProduct | Purchase: # (17)! ctx.state.user_balance += self.amount # (6)! if ctx.state.product is not None: # (7)! return Purchase(ctx.state.product) else: return SelectProduct() @dataclass class SelectProduct(BaseNode[MachineState]): async def run(self, ctx: GraphRunContext[MachineState]) -> Purchase: return Purchase(Prompt.ask('Select product')) PRODUCT_PRICES = { # (2)! 'water': 1.25, 'soda': 1.50, 'crisps': 1.75, 'chocolate': 2.00, } @dataclass class Purchase(BaseNode[MachineState, None, None]): # (18)! product: str async def run( self, ctx: GraphRunContext[MachineState] ) -> End | InsertCoin | SelectProduct: if price := PRODUCT_PRICES.get(self.product): # (8)! ctx.state.product = self.product # (9)! if ctx.state.user_balance >= price: # (10)! ctx.state.user_balance -= price return End(None) else: diff = price - ctx.state.user_balance print(f'Not enough money for {self.product}, need {diff:0.2f} more') #> Not enough money for crisps, need 0.75 more return InsertCoin() # (11)! else: print(f'No such product: {self.product}, try again') return SelectProduct() # (12)! vending_machine_graph = Graph( # (13)! nodes=[InsertCoin, CoinsInserted, SelectProduct, Purchase] ) async def main(): state = MachineState() # (14)! await vending_machine_graph.run(InsertCoin(), state=state) # (15)! print(f'purchase successful item={state.product} change={state.user_balance:0.2f}') #> purchase successful item=crisps change=0.25 ``` 1. The state of the vending machine is defined as a dataclass with the user's balance and the product they've selected, if any. 1. A dictionary of products mapped to prices. 1. The `InsertCoin` node, BaseNode is parameterized with `MachineState` as that's the state used in this graph. 1. The `InsertCoin` node prompts the user to insert coins. We keep things simple by just entering a monetary amount as a float. Before you start thinking this is a toy too since it's using rich's Prompt.ask within nodes, see [below](#example-human-in-the-loop) for how control flow can be managed when nodes require external input. 1. The `CoinsInserted` node; again this is a dataclass with one field `amount`. 1. Update the user's balance with the amount inserted. 1. If the user has already selected a product, go to `Purchase`, otherwise go to `SelectProduct`. 1. In the `Purchase` node, look up the price of the product if the user entered a valid product. 1. If the user did enter a valid product, set the product in the state so we don't revisit `SelectProduct`. 1. If the balance is enough to purchase the product, adjust the balance to reflect the purchase and return End to end the graph. We're not using the run return type, so we call `End` with `None`. 1. If the balance is insufficient, go to `InsertCoin` to prompt the user to insert more coins. 1. If the product is invalid, go to `SelectProduct` to prompt the user to select a product again. 1. The graph is created by passing a list of nodes to Graph. Order of nodes is not important, but it can affect how [diagrams](#mermaid-diagrams) are displayed. 1. Initialize the state. This will be passed to the graph run and mutated as the graph runs. 1. Run the graph with the initial state. Since the graph can be run from any node, we must pass the start node — in this case, `InsertCoin`. Graph.run returns a GraphRunResult that provides the final data and a history of the run. 1. 
The return type of the node's run method is important as it is used to determine the outgoing edges of the node. This information in turn is used to render [mermaid diagrams](#mermaid-diagrams) and is enforced at runtime to detect misbehavior as soon as possible. 1. The return type of `CoinsInserted`'s run method is a union, meaning multiple outgoing edges are possible. 1. Unlike other nodes, `Purchase` can end the run, so the RunEndT generic parameter must be set. In this case it's `None` since the graph run return type is `None`. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* A [mermaid diagram](#mermaid-diagrams) for this graph can be generated with the following code: vending_machine_diagram.py ```py from vending_machine import InsertCoin, vending_machine_graph vending_machine_graph.mermaid_code(start_node=InsertCoin) ``` The diagram generated by the above code is: ``` --- title: vending_machine_graph --- stateDiagram-v2 [*] --> InsertCoin InsertCoin --> CoinsInserted CoinsInserted --> SelectProduct CoinsInserted --> Purchase SelectProduct --> Purchase Purchase --> InsertCoin Purchase --> SelectProduct Purchase --> [*] ``` See [below](#mermaid-diagrams) for more information on generating diagrams. ## GenAI Example So far we haven't shown an example of a Graph that actually uses Pydantic AI or GenAI at all. In this example, one agent generates a welcome email to a user and the other agent provides feedback on the email. This graph has a very simple structure: ``` --- title: feedback_graph --- stateDiagram-v2 [*] --> WriteEmail WriteEmail --> Feedback Feedback --> WriteEmail Feedback --> [*] ``` genai_email_feedback.py ```python from __future__ import annotations as _annotations from dataclasses import dataclass, field from pydantic import BaseModel, EmailStr from pydantic_ai import Agent, format_as_xml from pydantic_ai.messages import ModelMessage from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class User: name: str email: EmailStr interests: list[str] @dataclass class Email: subject: str body: str @dataclass class State: user: User write_agent_messages: list[ModelMessage] = field(default_factory=list) email_writer_agent = Agent( 'google-gla:gemini-1.5-pro', output_type=Email, system_prompt='Write a welcome email to our tech blog.', ) @dataclass class WriteEmail(BaseNode[State]): email_feedback: str | None = None async def run(self, ctx: GraphRunContext[State]) -> Feedback: if self.email_feedback: prompt = ( f'Rewrite the email for the user:\n' f'{format_as_xml(ctx.state.user)}\n' f'Feedback: {self.email_feedback}' ) else: prompt = ( f'Write a welcome email for the user:\n' f'{format_as_xml(ctx.state.user)}' ) result = await email_writer_agent.run( prompt, message_history=ctx.state.write_agent_messages, ) ctx.state.write_agent_messages += result.new_messages() return Feedback(result.output) class EmailRequiresWrite(BaseModel): feedback: str class EmailOk(BaseModel): pass feedback_agent = Agent[None, EmailRequiresWrite | EmailOk]( 'openai:gpt-4o', output_type=EmailRequiresWrite | EmailOk, # type: ignore system_prompt=( 'Review the email and provide feedback, email must reference the users specific interests.' 
), ) @dataclass class Feedback(BaseNode[State, None, Email]): email: Email async def run( self, ctx: GraphRunContext[State], ) -> WriteEmail | End[Email]: prompt = format_as_xml({'user': ctx.state.user, 'email': self.email}) result = await feedback_agent.run(prompt) if isinstance(result.output, EmailRequiresWrite): return WriteEmail(email_feedback=result.output.feedback) else: return End(self.email) async def main(): user = User( name='John Doe', email='john.joe@example.com', interests=['Haskel', 'Lisp', 'Fortran'], ) state = State(user) feedback_graph = Graph(nodes=(WriteEmail, Feedback)) result = await feedback_graph.run(WriteEmail(), state=state) print(result.output) """ Email( subject='Welcome to our tech blog!', body='Hello John, Welcome to our tech blog! ...', ) """ ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Iterating Over a Graph ### Using `Graph.iter` for `async for` iteration Sometimes you want direct control or insight into each node as the graph executes. The easiest way to do that is with the Graph.iter method, which returns a **context manager** that yields a GraphRun object. The `GraphRun` is an async-iterable over the nodes of your graph, allowing you to record or modify them as they execute. Here's an example: count_down.py ```python from __future__ import annotations as _annotations from dataclasses import dataclass from pydantic_graph import Graph, BaseNode, End, GraphRunContext @dataclass class CountDownState: counter: int @dataclass class CountDown(BaseNode[CountDownState, None, int]): async def run(self, ctx: GraphRunContext[CountDownState]) -> CountDown | End[int]: if ctx.state.counter <= 0: return End(ctx.state.counter) ctx.state.counter -= 1 return CountDown() count_down_graph = Graph(nodes=[CountDown]) async def main(): state = CountDownState(counter=3) async with count_down_graph.iter(CountDown(), state=state) as run: # (1)! async for node in run: # (2)! print('Node:', node) #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: End(data=0) print('Final output:', run.result.output) # (3)! #> Final output: 0 ``` 1. `Graph.iter(...)` returns a GraphRun. 1. Here, we step through each node as it is executed. 1. Once the graph returns an End, the loop ends, and `run.result` becomes a GraphRunResult containing the final outcome (`0` here). ### Using `GraphRun.next(node)` manually Alternatively, you can drive iteration manually with the GraphRun.next method, which allows you to pass in whichever node you want to run next. You can modify or selectively skip nodes this way. Below is a contrived example that stops whenever the counter is at 2, ignoring any node runs beyond that: count_down_next.py ```python from pydantic_graph import End, FullStatePersistence from count_down import CountDown, CountDownState, count_down_graph async def main(): state = CountDownState(counter=5) persistence = FullStatePersistence() # (7)! async with count_down_graph.iter( CountDown(), state=state, persistence=persistence ) as run: node = run.next_node # (1)! while not isinstance(node, End): # (2)! print('Node:', node) #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() if state.counter == 2: break # (3)! node = await run.next(node) # (4)! print(run.result) # (5)! #> None for step in persistence.history: # (6)! 
        print('History Step:', step.state, step.state)
        #> History Step: CountDownState(counter=5) CountDownState(counter=5)
        #> History Step: CountDownState(counter=4) CountDownState(counter=4)
        #> History Step: CountDownState(counter=3) CountDownState(counter=3)
        #> History Step: CountDownState(counter=2) CountDownState(counter=2)
```

1. We start by grabbing the first node that will be run in the agent's graph.
1. The agent run is finished once an `End` node has been produced; instances of `End` cannot be passed to `next`.
1. If the user decides to stop early, we break out of the loop. The graph run won't have a real final result in that case (`run.result` remains `None`).
1. At each step, we call `await run.next(node)` to run it and get the next node (or an `End`).
1. Because we did not continue the run until it finished, the `result` is not set.
1. The run's history is still populated with the steps we executed so far.
1. Use FullStatePersistence so we can show the history of the run; see [State Persistence](#state-persistence) below for more information.

## State Persistence

One of the biggest benefits of finite state machine (FSM) graphs is how they simplify the handling of interrupted execution. This might happen for a variety of reasons:

- the state machine logic might fundamentally need to be paused — e.g. the returns workflow for an e-commerce order needs to wait for the item to be posted to the returns center, or execution of the next node needs input from a user and so has to wait for a new HTTP request,
- the execution takes so long that the entire graph can't reliably be executed in a single continuous run — e.g. a deep research agent that might take hours to run,
- you want to run multiple graph nodes in parallel in different processes / hardware instances (note: parallel node execution is not yet supported in `pydantic-graph`, see [#704](https://github.com/pydantic/pydantic-ai/issues/704)).

Trying to make a conventional control flow (i.e., boolean logic and nested function calls) implementation compatible with these usage scenarios generally results in brittle and over-complicated spaghetti code, with the logic required to interrupt and resume execution dominating the implementation.

To allow graph runs to be interrupted and resumed, `pydantic-graph` provides state persistence — a system for snapshotting the state of a graph run before and after each node is run, allowing a graph run to be resumed from any point in the graph.

`pydantic-graph` includes three state persistence implementations:

- SimpleStatePersistence — Simple in-memory state persistence that just holds the latest snapshot. If no state persistence implementation is provided when running a graph, this is used by default.
- FullStatePersistence — In-memory state persistence that holds a list of snapshots.
- FileStatePersistence — File-based state persistence that saves snapshots to a JSON file.

In production applications, developers should implement their own state persistence by subclassing the BaseStatePersistence abstract base class, which might persist runs in a relational database like PostgreSQL.

At a high level, the role of `StatePersistence` implementations is to store and retrieve NodeSnapshot and EndSnapshot objects.

graph.iter_from_persistence() may be used to run the graph based on the state stored in persistence.

We can run the `count_down_graph` from [above](#iterating-over-a-graph), using graph.iter_from_persistence() and FileStatePersistence.
As you can see in the code below, `run_node` requires no external application state (apart from state persistence) to be run, meaning graphs can easily be executed by distributed execution and queueing systems.

count_down_from_persistence.py

```python
from pathlib import Path

from pydantic_graph import End
from pydantic_graph.persistence.file import FileStatePersistence

from count_down import CountDown, CountDownState, count_down_graph

async def main():
    run_id = 'run_abc123'
    persistence = FileStatePersistence(Path(f'count_down_{run_id}.json'))  # (1)!
    state = CountDownState(counter=5)
    await count_down_graph.initialize(  # (2)!
        CountDown(), state=state, persistence=persistence
    )

    done = False
    while not done:
        done = await run_node(run_id)

async def run_node(run_id: str) -> bool:  # (3)!
    persistence = FileStatePersistence(Path(f'count_down_{run_id}.json'))
    async with count_down_graph.iter_from_persistence(persistence) as run:  # (4)!
        node_or_end = await run.next()  # (5)!

    print('Node:', node_or_end)
    #> Node: CountDown()
    #> Node: CountDown()
    #> Node: CountDown()
    #> Node: CountDown()
    #> Node: CountDown()
    #> Node: End(data=0)
    return isinstance(node_or_end, End)  # (6)!
```

1. Create a FileStatePersistence to use to start the graph.
1. Call graph.initialize() to set the initial graph state in the persistence object.
1. `run_node` is a pure function that doesn't need access to any other process state to run the next node of the graph, except the ID of the run.
1. Call graph.iter_from_persistence() to create a GraphRun object that will run the next node of the graph from the state stored in persistence.
1. run.next() will return either a node or an `End` object.
1. Check if the node is an `End` object; if it is, the graph run is complete.

*(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)*

### Example: Human in the loop

As noted above, state persistence allows graphs to be interrupted and resumed. One use case of this is to wait for user input before continuing.

In this example, an AI asks the user a question, the user provides an answer, the AI evaluates the answer and ends if the user got it right or asks another question if they got it wrong.

Instead of running the entire graph in a single process invocation, we run the graph by running the process repeatedly, optionally providing an answer to the question as a command line argument.
`ai_q_and_a_graph.py` — `question_graph` definition ai_q_and_a_graph.py ```python from __future__ import annotations as _annotations from typing import Annotated from pydantic_graph import Edge from dataclasses import dataclass, field from pydantic import BaseModel from pydantic_graph import ( BaseNode, End, Graph, GraphRunContext, ) from pydantic_ai import Agent, format_as_xml from pydantic_ai.messages import ModelMessage ask_agent = Agent('openai:gpt-4o', output_type=str, instrument=True) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list) evaluate_agent_messages: list[ModelMessage] = field(default_factory=list) @dataclass class Ask(BaseNode[QuestionState]): """Generate question using GPT-4o.""" docstring_notes = True async def run( self, ctx: GraphRunContext[QuestionState] ) -> Annotated[Answer, Edge(label='Ask the question')]: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.new_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationResult(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'openai:gpt-4o', output_type=EvaluationResult, system_prompt='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> Annotated[End[str], Edge(label='success')] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.new_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) ``` *(This example is complete, it can be run "as is")* ai_q_and_a_run.py ```python import sys from pathlib import Path from pydantic_graph import End from pydantic_graph.persistence.file import FileStatePersistence from pydantic_ai.messages import ModelMessage # noqa: F401 from ai_q_and_a_graph import Ask, question_graph, Evaluate, QuestionState, Answer async def main(): answer: str | None = sys.argv[1] if len(sys.argv) > 1 else None # (1)! persistence = FileStatePersistence(Path('question_graph.json')) # (2)! persistence.set_graph_types(question_graph) # (3)! if snapshot := await persistence.load_next(): # (4)! state = snapshot.state assert answer is not None node = Evaluate(answer) else: state = QuestionState() node = Ask() # (5)! async with question_graph.iter(node, state=state, persistence=persistence) as run: while True: node = await run.next() # (6)! if isinstance(node, End): # (7)! print('END:', node.data) history = await persistence.load_all() # (8)! 
                print([e.node for e in history])
                break
            elif isinstance(node, Answer):  # (9)!
                print(node.question)
                #> What is the capital of France?
                break
            # otherwise just continue
```

1. Get the user's answer from the command line, if provided. See [question graph example](../examples/question-graph/) for a complete example.
1. Create a state persistence instance; the `'question_graph.json'` file may or may not already exist.
1. Since we're using the persistence interface outside a graph, we need to call set_graph_types to set the graph generic types `StateT` and `RunEndT` for the persistence instance. This is necessary to allow the persistence instance to know how to serialize and deserialize graph nodes.
1. If we've run the graph before, load_next will return a snapshot of the next node to run. Here we use `state` from that snapshot, and create a new `Evaluate` node with the answer provided on the command line.
1. If the graph hasn't been run before, we create a new `QuestionState` and start with the `Ask` node.
1. Call GraphRun.next() to run the node. This will return either a node or an `End` object.
1. If the node is an `End` object, the graph run is complete. The `data` field of the `End` object contains the comment returned by the `evaluate_agent` about the correct answer.
1. To demonstrate the state persistence, we call load_all to get all the snapshots from the persistence instance. This will return a list of Snapshot objects.
1. If the node is an `Answer` object, we print the question and break out of the loop to end the process and wait for user input.

*(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)*

For a complete example of this graph, see the [question graph example](../examples/question-graph/).

## Dependency Injection

As with Pydantic AI, `pydantic-graph` supports dependency injection via a generic parameter on Graph and BaseNode, and the GraphRunContext.deps field.
As an example of dependency injection, let's modify the `DivisibleBy5` example [above](#graph) to use a ProcessPoolExecutor to run the compute load in a separate process (this is a contrived example, `ProcessPoolExecutor` wouldn't actually improve performance in this example): deps_example.py ```py from __future__ import annotations import asyncio from concurrent.futures import ProcessPoolExecutor from dataclasses import dataclass from pydantic_graph import BaseNode, End, FullStatePersistence, Graph, GraphRunContext @dataclass class GraphDeps: executor: ProcessPoolExecutor @dataclass class DivisibleBy5(BaseNode[None, GraphDeps, int]): foo: int async def run( self, ctx: GraphRunContext[None, GraphDeps], ) -> Increment | End[int]: if self.foo % 5 == 0: return End(self.foo) else: return Increment(self.foo) @dataclass class Increment(BaseNode[None, GraphDeps]): foo: int async def run(self, ctx: GraphRunContext[None, GraphDeps]) -> DivisibleBy5: loop = asyncio.get_running_loop() compute_result = await loop.run_in_executor( ctx.deps.executor, self.compute, ) return DivisibleBy5(compute_result) def compute(self) -> int: return self.foo + 1 fives_graph = Graph(nodes=[DivisibleBy5, Increment]) async def main(): with ProcessPoolExecutor() as executor: deps = GraphDeps(executor) result = await fives_graph.run(DivisibleBy5(3), deps=deps, persistence=FullStatePersistence()) print(result.output) #> 5 # the full history is quite verbose (see below), so we'll just print the summary print([item.node for item in result.persistence.history]) """ [ DivisibleBy5(foo=3), Increment(foo=3), DivisibleBy5(foo=4), Increment(foo=4), DivisibleBy5(foo=5), End(data=5), ] """ ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Mermaid Diagrams Pydantic Graph can generate [mermaid](https://mermaid.js.org/) [`stateDiagram-v2`](https://mermaid.js.org/syntax/stateDiagram.html) diagrams for graphs, as shown above. 
These diagrams can be generated with: - Graph.mermaid_code to generate the mermaid code for a graph - Graph.mermaid_image to generate an image of the graph using [mermaid.ink](https://mermaid.ink/) - Graph.mermaid_save to generate an image of the graph using [mermaid.ink](https://mermaid.ink/) and save it to a file Beyond the diagrams shown above, you can also customize mermaid diagrams with the following options: - Edge allows you to apply a label to an edge - BaseNode.docstring_notes and BaseNode.get_note allows you to add notes to nodes - The highlighted_nodes parameter allows you to highlight specific node(s) in the diagram Putting that together, we can edit the last [`ai_q_and_a_graph.py`](#example-human-in-the-loop) example to: - add labels to some edges - add a note to the `Ask` node - highlight the `Answer` node - save the diagram as a `PNG` image to file ai_q_and_a_graph_extra.py ```python from typing import Annotated from pydantic_graph import BaseNode, End, Graph, GraphRunContext, Edge ask_agent = Agent('openai:gpt-4o', output_type=str, instrument=True) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list) evaluate_agent_messages: list[ModelMessage] = field(default_factory=list) @dataclass class Ask(BaseNode[QuestionState]): """Generate question using GPT-4o.""" docstring_notes = True async def run( self, ctx: GraphRunContext[QuestionState] ) -> Annotated[Answer, Edge(label='Ask the question')]: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.new_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationResult(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'openai:gpt-4o', output_type=EvaluationResult, system_prompt='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> Annotated[End[str], Edge(label='success')] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.new_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) ``` *(This example is not complete and cannot be run directly)* This would generate an image that looks like this: ``` --- title: question_graph --- stateDiagram-v2 Ask --> Answer: Ask the question note right of Ask Judge the answer. Decide on next step. 
end note Answer --> Evaluate Evaluate --> Reprimand Evaluate --> [*]: success Reprimand --> Ask classDef highlighted fill:#fdff32 class Answer highlighted ``` ### Setting Direction of the State Diagram You can specify the direction of the state diagram using one of the following values: - `'TB'`: Top to bottom, the diagram flows vertically from top to bottom. - `'LR'`: Left to right, the diagram flows horizontally from left to right. - `'RL'`: Right to left, the diagram flows horizontally from right to left. - `'BT'`: Bottom to top, the diagram flows vertically from bottom to top. Here is an example of how to do this using 'Left to Right' (LR) instead of the default 'Top to Bottom' (TB): vending_machine_diagram.py ```py from vending_machine import InsertCoin, vending_machine_graph vending_machine_graph.mermaid_code(start_node=InsertCoin, direction='LR') ``` ``` --- title: vending_machine_graph --- stateDiagram-v2 direction LR [*] --> InsertCoin InsertCoin --> CoinsInserted CoinsInserted --> SelectProduct CoinsInserted --> Purchase SelectProduct --> Purchase Purchase --> InsertCoin Purchase --> SelectProduct Purchase --> [*] ``` # API Reference # `pydantic_ai.ag_ui` Provides an AG-UI protocol adapter for the Pydantic AI agent. This package provides seamless integration between pydantic-ai agents and ag-ui for building interactive AI applications with streaming event-based communication. ### SSE_CONTENT_TYPE ```python SSE_CONTENT_TYPE: Final[str] = 'text/event-stream' ``` Content type header value for Server-Sent Events (SSE). ### OnCompleteFunc ```python OnCompleteFunc: TypeAlias = ( Callable[[AgentRunResult[Any]], None] | Callable[[AgentRunResult[Any]], Awaitable[None]] ) ``` Callback function type that receives the `AgentRunResult` of the completed run. Can be sync or async. ### AGUIApp Bases: `Generic[AgentDepsT, OutputDataT]`, `Starlette` ASGI application for running Pydantic AI agents with AG-UI protocol support. Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python class AGUIApp(Generic[AgentDepsT, OutputDataT], Starlette): """ASGI application for running Pydantic AI agents with AG-UI protocol support.""" def __init__( self, agent: AbstractAgent[AgentDepsT, OutputDataT], *, # Agent.iter parameters. output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, # Starlette parameters. debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None = None, ) -> None: """An ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ag_ui.StateHandler] protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use [`pydantic_ai.ag_ui.run_ag_ui`][pydantic_ai.ag_ui.run_ag_ui] or [`pydantic_ai.ag_ui.handle_ag_ui_request`][pydantic_ai.ag_ui.handle_ag_ui_request] instead. 
Args: agent: The agent to run. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. 
""" super().__init__( debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) async def endpoint(request: Request) -> Response: """Endpoint to run the agent with the provided input data.""" return await handle_ag_ui_request( agent, request, output_type=output_type, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, ) self.router.add_route('/', endpoint, methods=['POST'], name='run_agent') ``` #### __init__ ```python __init__( agent: AbstractAgent[AgentDepsT, OutputDataT], *, output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: ( Mapping[Any, ExceptionHandler] | None ) = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: ( Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None ) = None ) -> None ``` An ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the StateHandler protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use pydantic_ai.ag_ui.run_ag_ui or pydantic_ai.ag_ui.handle_ag_ui_request instead. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `agent` | `AbstractAgent[AgentDepsT, OutputDataT]` | The agent to run. | *required* | | `output_type` | `OutputSpec[Any] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `debug` | `bool` | Boolean indicating if debug tracebacks should be returned on errors. | `False` | | `routes` | `Sequence[BaseRoute] | None` | A list of routes to serve incoming HTTP and WebSocket requests. | `None` | | `middleware` | `Sequence[Middleware] | None` | A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. 
ServerErrorMiddleware is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. ExceptionMiddleware is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. | `None` | | `exception_handlers` | `Mapping[Any, ExceptionHandler] | None` | A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form handler(request, exc) -> response and may be either standard functions, or async functions. | `None` | | `on_startup` | `Sequence[Callable[[], Any]] | None` | A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. | `None` | | `on_shutdown` | `Sequence[Callable[[], Any]] | None` | A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. | `None` | | `lifespan` | `Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None` | A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the on_startup and on_shutdown handlers. Use one or the other, not both. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python def __init__( self, agent: AbstractAgent[AgentDepsT, OutputDataT], *, # Agent.iter parameters. output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, # Starlette parameters. debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None = None, ) -> None: """An ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ag_ui.StateHandler] protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use [`pydantic_ai.ag_ui.run_ag_ui`][pydantic_ai.ag_ui.run_ag_ui] or [`pydantic_ai.ag_ui.handle_ag_ui_request`][pydantic_ai.ag_ui.handle_ag_ui_request] instead. Args: agent: The agent to run. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. 
debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. """ super().__init__( debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) async def endpoint(request: Request) -> Response: """Endpoint to run the agent with the provided input data.""" return await handle_ag_ui_request( agent, request, output_type=output_type, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, ) self.router.add_route('/', endpoint, methods=['POST'], name='run_agent') ``` ### handle_ag_ui_request ```python handle_ag_ui_request( agent: AbstractAgent[AgentDepsT, Any], request: Request, *, output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, on_complete: OnCompleteFunc | None = None ) -> Response ``` Handle an AG-UI request by running the agent and returning a streaming response. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `agent` | `AbstractAgent[AgentDepsT, Any]` | The agent to run. | *required* | | `request` | `Request` | The Starlette request (e.g. from FastAPI) containing the AG-UI run input. | *required* | | `output_type` | `OutputSpec[Any] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. 
| `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `on_complete` | `OnCompleteFunc | None` | Optional callback function called when the agent run completes successfully. The callback receives the completed AgentRunResult and can access all_messages() and other result data. | `None` | Returns: | Type | Description | | --- | --- | | `Response` | A streaming Starlette response with AG-UI protocol events. | Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python async def handle_ag_ui_request( agent: AbstractAgent[AgentDepsT, Any], request: Request, *, output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, on_complete: OnCompleteFunc | None = None, ) -> Response: """Handle an AG-UI request by running the agent and returning a streaming response. Args: agent: The agent to run. request: The Starlette request (e.g. from FastAPI) containing the AG-UI run input. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can access `all_messages()` and other result data. Returns: A streaming Starlette response with AG-UI protocol events. 
""" accept = request.headers.get('accept', SSE_CONTENT_TYPE) try: input_data = RunAgentInput.model_validate(await request.json()) except ValidationError as e: # pragma: no cover return Response( content=json.dumps(e.json()), media_type='application/json', status_code=HTTPStatus.UNPROCESSABLE_ENTITY, ) return StreamingResponse( run_ag_ui( agent, input_data, accept, output_type=output_type, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, on_complete=on_complete, ), media_type=accept, ) ``` ### run_ag_ui ```python run_ag_ui( agent: AbstractAgent[AgentDepsT, Any], run_input: RunAgentInput, accept: str = SSE_CONTENT_TYPE, *, output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, on_complete: OnCompleteFunc | None = None ) -> AsyncIterator[str] ``` Run the agent with the AG-UI run input and stream AG-UI protocol events. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `agent` | `AbstractAgent[AgentDepsT, Any]` | The agent to run. | *required* | | `run_input` | `RunAgentInput` | The AG-UI run input containing thread_id, run_id, messages, etc. | *required* | | `accept` | `str` | The accept header value for the run. | `SSE_CONTENT_TYPE` | | `output_type` | `OutputSpec[Any] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `on_complete` | `OnCompleteFunc | None` | Optional callback function called when the agent run completes successfully. The callback receives the completed AgentRunResult and can access all_messages() and other result data. | `None` | Yields: | Type | Description | | --- | --- | | `AsyncIterator[str]` | Streaming event chunks encoded as strings according to the accept header value. 
| Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python async def run_ag_ui( agent: AbstractAgent[AgentDepsT, Any], run_input: RunAgentInput, accept: str = SSE_CONTENT_TYPE, *, output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, on_complete: OnCompleteFunc | None = None, ) -> AsyncIterator[str]: """Run the agent with the AG-UI run input and stream AG-UI protocol events. Args: agent: The agent to run. run_input: The AG-UI run input containing thread_id, run_id, messages, etc. accept: The accept header value for the run. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can access `all_messages()` and other result data. Yields: Streaming event chunks encoded as strings according to the accept header value. """ encoder = EventEncoder(accept=accept) if run_input.tools: # AG-UI tools can't be prefixed as that would result in a mismatch between the tool names in the # Pydantic AI events and actual AG-UI tool names, preventing the tool from being called. If any # conflicts arise, the AG-UI tool should be renamed or a `PrefixedToolset` used for local toolsets. toolset = _AGUIFrontendToolset[AgentDepsT](run_input.tools) toolsets = [*toolsets, toolset] if toolsets else [toolset] try: yield encoder.encode( RunStartedEvent( thread_id=run_input.thread_id, run_id=run_input.run_id, ), ) if not run_input.messages: raise _NoMessagesError raw_state: dict[str, Any] = run_input.state or {} if isinstance(deps, StateHandler): if isinstance(deps.state, BaseModel): try: state = type(deps.state).model_validate(raw_state) except ValidationError as e: # pragma: no cover raise _InvalidStateError from e else: state = raw_state deps = replace(deps, state=state) elif raw_state: raise UserError( f'AG-UI state is provided but `deps` of type `{type(deps).__name__}` does not implement the `StateHandler` protocol: it needs to be a dataclass with a non-optional `state` field.' ) else: # `deps` not being a `StateHandler` is OK if there is no state. 
pass messages = _messages_from_ag_ui(run_input.messages) async with agent.iter( user_prompt=None, output_type=[output_type or agent.output_type, DeferredToolRequests], message_history=messages, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, ) as run: async for event in _agent_stream(run): yield encoder.encode(event) if on_complete is not None and run.result is not None: if _utils.is_async_callable(on_complete): await on_complete(run.result) else: await _utils.run_in_executor(on_complete, run.result) except _RunError as e: yield encoder.encode( RunErrorEvent(message=e.message, code=e.code), ) except Exception as e: yield encoder.encode( RunErrorEvent(message=str(e)), ) raise e else: yield encoder.encode( RunFinishedEvent( thread_id=run_input.thread_id, run_id=run_input.run_id, ), ) ``` ### StateHandler Bases: `Protocol` Protocol for state handlers in agent runs. Requires the class to be a dataclass with a `state` field. Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python @runtime_checkable class StateHandler(Protocol): """Protocol for state handlers in agent runs. Requires the class to be a dataclass with a `state` field.""" # Has to be a dataclass so we can use `replace` to update the state. # From https://github.com/python/typeshed/blob/9ab7fde0a0cd24ed7a72837fcb21093b811b80d8/stdlib/_typeshed/__init__.pyi#L352 __dataclass_fields__: ClassVar[dict[str, Field[Any]]] @property def state(self) -> State: """Get the current state of the agent run.""" ... @state.setter def state(self, state: State) -> None: """Set the state of the agent run. This method is called to update the state of the agent run with the provided state. Args: state: The run state. Raises: InvalidStateError: If `state` does not match the expected model. """ ... ``` #### state ```python state: State ``` Get the current state of the agent run. ### StateDeps Bases: `Generic[StateT]` Provides AG-UI state management. This class is used to manage the state of an agent run. It allows setting the state of the agent run with a specific type of state model, which must be a subclass of `BaseModel`. The state is set using the `state` setter by the `Adapter` when the run starts. Implements the `StateHandler` protocol. Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python @dataclass class StateDeps(Generic[StateT]): """Provides AG-UI state management. This class is used to manage the state of an agent run. It allows setting the state of the agent run with a specific type of state model, which must be a subclass of `BaseModel`. The state is set using the `state` setter by the `Adapter` when the run starts. Implements the `StateHandler` protocol. """ state: StateT ``` # `pydantic_ai.agent` ### Agent Bases: `AbstractAgent[AgentDepsT, OutputDataT]` Class for defining "agents" - a way to have a specific type of "conversation" with an LLM. Agents are generic in the dependency type they take AgentDepsT and the output type they return, OutputDataT. By default, if neither generic parameter is customised, agents have type `Agent[None, str]`. Minimal usage example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. 
``` Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python @dataclasses.dataclass(init=False) class Agent(AbstractAgent[AgentDepsT, OutputDataT]): """Class for defining "agents" - a way to have a specific type of "conversation" with an LLM. Agents are generic in the dependency type they take [`AgentDepsT`][pydantic_ai.tools.AgentDepsT] and the output type they return, [`OutputDataT`][pydantic_ai.output.OutputDataT]. By default, if neither generic parameter is customised, agents have type `Agent[None, str]`. Minimal usage example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` """ _model: models.Model | models.KnownModelName | str | None _name: str | None end_strategy: EndStrategy """Strategy for handling tool calls when a final result is found.""" model_settings: ModelSettings | None """Optional model request settings to use for this agents's runs, by default. Note, if `model_settings` is provided by `run`, `run_sync`, or `run_stream`, those settings will be merged with this value, with the runtime argument taking priority. """ _output_type: OutputSpec[OutputDataT] instrument: InstrumentationSettings | bool | None """Options to automatically instrument with OpenTelemetry.""" _instrument_default: ClassVar[InstrumentationSettings | bool] = False _deps_type: type[AgentDepsT] = dataclasses.field(repr=False) _output_schema: _output.BaseOutputSchema[OutputDataT] = dataclasses.field(repr=False) _output_validators: list[_output.OutputValidator[AgentDepsT, OutputDataT]] = dataclasses.field(repr=False) _instructions: str | None = dataclasses.field(repr=False) _instructions_functions: list[_system_prompt.SystemPromptRunner[AgentDepsT]] = dataclasses.field(repr=False) _system_prompts: tuple[str, ...] 
= dataclasses.field(repr=False) _system_prompt_functions: list[_system_prompt.SystemPromptRunner[AgentDepsT]] = dataclasses.field(repr=False) _system_prompt_dynamic_functions: dict[str, _system_prompt.SystemPromptRunner[AgentDepsT]] = dataclasses.field( repr=False ) _function_toolset: FunctionToolset[AgentDepsT] = dataclasses.field(repr=False) _output_toolset: OutputToolset[AgentDepsT] | None = dataclasses.field(repr=False) _user_toolsets: list[AbstractToolset[AgentDepsT]] = dataclasses.field(repr=False) _prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = dataclasses.field(repr=False) _prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = dataclasses.field(repr=False) _max_result_retries: int = dataclasses.field(repr=False) _max_tool_retries: int = dataclasses.field(repr=False) _event_stream_handler: EventStreamHandler[AgentDepsT] | None = dataclasses.field(repr=False) _enter_lock: Lock = dataclasses.field(repr=False) _entered_count: int = dataclasses.field(repr=False) _exit_stack: AsyncExitStack | None = dataclasses.field(repr=False) @overload def __init__( self, model: models.Model | models.KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: str | _system_prompt.SystemPromptFunc[AgentDepsT] | Sequence[str | _system_prompt.SystemPromptFunc[AgentDepsT]] | None = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, toolsets: Sequence[AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> None: ... @overload @deprecated('`mcp_servers` is deprecated, use `toolsets` instead.') def __init__( self, model: models.Model | models.KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: str | _system_prompt.SystemPromptFunc[AgentDepsT] | Sequence[str | _system_prompt.SystemPromptFunc[AgentDepsT]] | None = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, mcp_servers: Sequence[MCPServer] = (), defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> None: ... 
def __init__( self, model: models.Model | models.KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: str | _system_prompt.SystemPromptFunc[AgentDepsT] | Sequence[str | _system_prompt.SystemPromptFunc[AgentDepsT]] | None = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, toolsets: Sequence[AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Any, ): """Create an agent. Args: model: The default model to use for this agent, if not provided, you must provide the model when calling it. We allow `str` here since the actual list of allowed models changes frequently. output_type: The type of the output data, used to validate the data returned by the model, defaults to `str`. instructions: Instructions to use for this agent, you can also register instructions via a function with [`instructions`][pydantic_ai.Agent.instructions]. system_prompt: Static system prompts to use for this agent, you can also register system prompts via a function with [`system_prompt`][pydantic_ai.Agent.system_prompt]. deps_type: The type used for dependency injection, this parameter exists solely to allow you to fully parameterize the agent, and therefore get the best out of static type checking. If you're not using deps, but want type checking to pass, you can set `deps=None` to satisfy Pyright or add a type hint `: Agent[None, ]`. name: The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. model_settings: Optional model request settings to use for this agent's runs, by default. retries: The default number of retries to allow before raising an error. output_retries: The maximum number of retries to allow for output validation, defaults to `retries`. tools: Tools to register with the agent, you can also register tools via the decorators [`@agent.tool`][pydantic_ai.Agent.tool] and [`@agent.tool_plain`][pydantic_ai.Agent.tool_plain]. builtin_tools: The builtin tools that the agent will use. This depends on the model, as some models may not support certain tools. If the model doesn't support the builtin tools, an error will be raised. prepare_tools: Custom function to prepare the tool definition of all tools for each step, except output tools. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. See [`ToolsPrepareFunc`][pydantic_ai.tools.ToolsPrepareFunc] prepare_output_tools: Custom function to prepare the tool definition of all output tools for each step. This is useful if you want to customize the definition of multiple output tools or you want to register a subset of output tools for a given step. 
See [`ToolsPrepareFunc`][pydantic_ai.tools.ToolsPrepareFunc] toolsets: Toolsets to register with the agent, including MCP servers and functions which take a run context and return a toolset. See [`ToolsetFunc`][pydantic_ai.toolsets.ToolsetFunc] for more information. defer_model_check: by default, if you provide a [named][pydantic_ai.models.KnownModelName] model, it's evaluated to create a [`Model`][pydantic_ai.models.Model] instance immediately, which checks for the necessary environment variables. Set this to `false` to defer the evaluation until the first run. Useful if you want to [override the model][pydantic_ai.Agent.override] for testing. end_strategy: Strategy for handling tool calls that are requested alongside a final result. See [`EndStrategy`][pydantic_ai.agent.EndStrategy] for more information. instrument: Set to True to automatically instrument with OpenTelemetry, which will use Logfire if it's configured. Set to an instance of [`InstrumentationSettings`][pydantic_ai.agent.InstrumentationSettings] to customize. If this isn't set, then the last value set by [`Agent.instrument_all()`][pydantic_ai.Agent.instrument_all] will be used, which defaults to False. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. history_processors: Optional list of callables to process the message history before sending it to the model. Each processor takes a list of messages and returns a modified list of messages. Processors can be sync or async and are applied in sequence. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools. """ if model is None or defer_model_check: self._model = model else: self._model = models.infer_model(model) self._name = name self.end_strategy = end_strategy self.model_settings = model_settings self._output_type = output_type self.instrument = instrument self._deps_type = deps_type if mcp_servers := _deprecated_kwargs.pop('mcp_servers', None): if toolsets is not None: # pragma: no cover raise TypeError('`mcp_servers` and `toolsets` cannot be set at the same time.') warnings.warn('`mcp_servers` is deprecated, use `toolsets` instead', DeprecationWarning) toolsets = mcp_servers _utils.validate_empty_kwargs(_deprecated_kwargs) default_output_mode = ( self.model.profile.default_structured_output_mode if isinstance(self.model, models.Model) else None ) self._output_schema = _output.OutputSchema[OutputDataT].build(output_type, default_mode=default_output_mode) self._output_validators = [] self._instructions = '' self._instructions_functions = [] if isinstance(instructions, str | Callable): instructions = [instructions] for instruction in instructions or []: if isinstance(instruction, str): self._instructions += instruction + '\n' else: self._instructions_functions.append(_system_prompt.SystemPromptRunner(instruction)) self._instructions = self._instructions.strip() or None self._system_prompts = (system_prompt,) if isinstance(system_prompt, str) else tuple(system_prompt) self._system_prompt_functions = [] self._system_prompt_dynamic_functions = {} self._max_result_retries = output_retries if output_retries is not None else retries self._max_tool_retries = retries self._builtin_tools = builtin_tools self._prepare_tools = prepare_tools self._prepare_output_tools = prepare_output_tools self._output_toolset = self._output_schema.toolset if self._output_toolset: self._output_toolset.max_retries = self._max_result_retries self._function_toolset = _AgentFunctionToolset( tools, 
max_retries=self._max_tool_retries, output_schema=self._output_schema ) self._dynamic_toolsets = [ DynamicToolset[AgentDepsT](toolset_func=toolset) for toolset in toolsets or [] if not isinstance(toolset, AbstractToolset) ] self._user_toolsets = [toolset for toolset in toolsets or [] if isinstance(toolset, AbstractToolset)] self.history_processors = history_processors or [] self._event_stream_handler = event_stream_handler self._override_deps: ContextVar[_utils.Option[AgentDepsT]] = ContextVar('_override_deps', default=None) self._override_model: ContextVar[_utils.Option[models.Model]] = ContextVar('_override_model', default=None) self._override_toolsets: ContextVar[_utils.Option[Sequence[AbstractToolset[AgentDepsT]]]] = ContextVar( '_override_toolsets', default=None ) self._override_tools: ContextVar[ _utils.Option[Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]]] ] = ContextVar('_override_tools', default=None) self._enter_lock = Lock() self._entered_count = 0 self._exit_stack = None @staticmethod def instrument_all(instrument: InstrumentationSettings | bool = True) -> None: """Set the instrumentation options for all agents where `instrument` is not set.""" Agent._instrument_default = instrument @property def model(self) -> models.Model | models.KnownModelName | str | None: """The default model configured for this agent.""" return self._model @model.setter def model(self, value: models.Model | models.KnownModelName | str | None) -> None: """Set the default model configured for this agent. We allow `str` here since the actual list of allowed models changes frequently. """ self._model = value @property def name(self) -> str | None: """The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. """ return self._name @name.setter def name(self, value: str | None) -> None: """Set the name of the agent, used for logging.""" self._name = value @property def deps_type(self) -> type: """The type of dependencies used by the agent.""" return self._deps_type @property def output_type(self) -> OutputSpec[OutputDataT]: """The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`.""" return self._output_type @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: """Optional handler for events from the model's streaming response and the agent's execution of tools.""" return self._event_stream_handler def __repr__(self) -> str: return f'{type(self).__name__}(model={self.model!r}, name={self.name!r}, end_strategy={self.end_strategy!r}, model_settings={self.model_settings!r}, output_type={self.output_type!r}, instrument={self.instrument!r})' @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... 
@overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def iter( # noqa: C901 self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. 
model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) model_used = self._get_model(model) del model deps = self._get_deps(deps) new_message_index = len(message_history) if message_history else 0 output_schema = self._prepare_output_schema(output_type, model_used.profile) output_type_ = output_type or self.output_type # We consider it a user error if a user tries to restrict the result type while having an output validator that # may change the result type from the restricted type to something else. Therefore, we consider the following # typecast reasonable, even though it is possible to violate it with otherwise-type-checked code. output_validators = cast(list[_output.OutputValidator[AgentDepsT, RunOutputDataT]], self._output_validators) output_toolset = self._output_toolset if output_schema != self._output_schema or output_validators: output_toolset = cast(OutputToolset[AgentDepsT], output_schema.toolset) if output_toolset: output_toolset.max_retries = self._max_result_retries output_toolset.output_validators = output_validators toolset = self._get_toolset(output_toolset=output_toolset, additional_toolsets=toolsets) tool_manager = ToolManager[AgentDepsT](toolset) # Build the graph graph: Graph[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any], FinalResult[Any]] = ( _agent_graph.build_agent_graph(self.name, self._deps_type, output_type_) ) # Build the initial state usage = usage or _usage.RunUsage() state = _agent_graph.GraphAgentState( message_history=message_history[:] if message_history else [], usage=usage, retries=0, run_step=0, ) # Merge model settings in order of precedence: run > agent > model merged_settings = merge_model_settings(model_used.settings, self.model_settings) model_settings = merge_model_settings(merged_settings, model_settings) usage_limits = usage_limits or _usage.UsageLimits() async def get_instructions(run_context: RunContext[AgentDepsT]) -> str | None: parts = [ self._instructions, *[await func.run(run_context) for func in self._instructions_functions], ] model_profile = model_used.profile if isinstance(output_schema, _output.PromptedOutputSchema): instructions = output_schema.instructions(model_profile.prompted_output_template) parts.append(instructions) parts = [p for p in parts if p] if not parts: return None return '\n\n'.join(parts).strip() if isinstance(model_used, InstrumentedModel): instrumentation_settings = model_used.instrumentation_settings tracer = model_used.instrumentation_settings.tracer else: instrumentation_settings = None tracer = NoOpTracer() tool_call_results: dict[str, DeferredToolResult] | None = None if deferred_tool_results is not None: tool_call_results = {} for tool_call_id, approval in deferred_tool_results.approvals.items(): if approval is True: approval = ToolApproved() elif approval is False: approval = ToolDenied() tool_call_results[tool_call_id] = approval if calls := deferred_tool_results.calls: call_result_types = _utils.get_union_args(DeferredToolCallResult) for tool_call_id, result in calls.items(): if not isinstance(result, call_result_types): result = 
_messages.ToolReturn(result) tool_call_results[tool_call_id] = result graph_deps = _agent_graph.GraphAgentDeps[AgentDepsT, RunOutputDataT]( user_deps=deps, prompt=user_prompt, new_message_index=new_message_index, model=model_used, model_settings=model_settings, usage_limits=usage_limits, max_result_retries=self._max_result_retries, end_strategy=self.end_strategy, output_schema=output_schema, output_validators=output_validators, history_processors=self.history_processors, builtin_tools=list(self._builtin_tools), tool_manager=tool_manager, tool_call_results=tool_call_results, tracer=tracer, get_instructions=get_instructions, instrumentation_settings=instrumentation_settings, ) start_node = _agent_graph.UserPromptNode[AgentDepsT]( user_prompt=user_prompt, instructions=self._instructions, instructions_functions=self._instructions_functions, system_prompts=self._system_prompts, system_prompt_functions=self._system_prompt_functions, system_prompt_dynamic_functions=self._system_prompt_dynamic_functions, ) agent_name = self.name or 'agent' run_span = tracer.start_span( 'agent run', attributes={ 'model_name': model_used.model_name if model_used else 'no-model', 'agent_name': agent_name, 'logfire.msg': f'{agent_name} run', }, ) try: async with toolset: async with graph.iter( start_node, state=state, deps=graph_deps, span=use_span(run_span) if run_span.is_recording() else None, infer_name=False, ) as graph_run: agent_run = AgentRun(graph_run) yield agent_run if (final_result := agent_run.result) is not None and run_span.is_recording(): if instrumentation_settings and instrumentation_settings.include_content: run_span.set_attribute( 'final_result', ( final_result.output if isinstance(final_result.output, str) else json.dumps(InstrumentedModel.serialize_any(final_result.output)) ), ) finally: try: if instrumentation_settings and run_span.is_recording(): run_span.set_attributes(self._run_span_end_attributes(state, usage, instrumentation_settings)) finally: run_span.end() def _run_span_end_attributes( self, state: _agent_graph.GraphAgentState, usage: _usage.RunUsage, settings: InstrumentationSettings ): if settings.version == 1: attrs = { 'all_messages_events': json.dumps( [ InstrumentedModel.event_to_dict(e) for e in settings.messages_to_otel_events(state.message_history) ] ) } else: attrs = { 'pydantic_ai.all_messages': json.dumps(settings.messages_to_otel_messages(state.message_history)), **settings.system_instructions_attributes(self._instructions), } return { **usage.opentelemetry_attributes(), **attrs, 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { **{attr: {'type': 'array'} for attr in attrs.keys()}, 'final_result': {'type': 'object'}, }, } ), } @contextmanager def override( self, *, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. 
toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. """ if _utils.is_set(deps): deps_token = self._override_deps.set(_utils.Some(deps)) else: deps_token = None if _utils.is_set(model): model_token = self._override_model.set(_utils.Some(models.infer_model(model))) else: model_token = None if _utils.is_set(toolsets): toolsets_token = self._override_toolsets.set(_utils.Some(toolsets)) else: toolsets_token = None if _utils.is_set(tools): tools_token = self._override_tools.set(_utils.Some(tools)) else: tools_token = None try: yield finally: if deps_token is not None: self._override_deps.reset(deps_token) if model_token is not None: self._override_model.reset(model_token) if toolsets_token is not None: self._override_toolsets.reset(toolsets_token) if tools_token is not None: self._override_tools.reset(tools_token) @overload def instructions( self, func: Callable[[RunContext[AgentDepsT]], str], / ) -> Callable[[RunContext[AgentDepsT]], str]: ... @overload def instructions( self, func: Callable[[RunContext[AgentDepsT]], Awaitable[str]], / ) -> Callable[[RunContext[AgentDepsT]], Awaitable[str]]: ... @overload def instructions(self, func: Callable[[], str], /) -> Callable[[], str]: ... @overload def instructions(self, func: Callable[[], Awaitable[str]], /) -> Callable[[], Awaitable[str]]: ... @overload def instructions( self, / ) -> Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]]: ... def instructions( self, func: _system_prompt.SystemPromptFunc[AgentDepsT] | None = None, /, ) -> ( Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] | _system_prompt.SystemPromptFunc[AgentDepsT] ): """Decorator to register an instructions function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.instructions`). Overloads for every possible signature of `instructions` are included so the decorator doesn't obscure the type of the function. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.instructions def simple_instructions() -> str: return 'foobar' @agent.instructions async def async_instructions(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` """ if func is None: def decorator( func_: _system_prompt.SystemPromptFunc[AgentDepsT], ) -> _system_prompt.SystemPromptFunc[AgentDepsT]: self._instructions_functions.append(_system_prompt.SystemPromptRunner(func_)) return func_ return decorator else: self._instructions_functions.append(_system_prompt.SystemPromptRunner(func)) return func @overload def system_prompt( self, func: Callable[[RunContext[AgentDepsT]], str], / ) -> Callable[[RunContext[AgentDepsT]], str]: ... @overload def system_prompt( self, func: Callable[[RunContext[AgentDepsT]], Awaitable[str]], / ) -> Callable[[RunContext[AgentDepsT]], Awaitable[str]]: ... @overload def system_prompt(self, func: Callable[[], str], /) -> Callable[[], str]: ... @overload def system_prompt(self, func: Callable[[], Awaitable[str]], /) -> Callable[[], Awaitable[str]]: ... @overload def system_prompt( self, /, *, dynamic: bool = False ) -> Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]]: ... 
def system_prompt( self, func: _system_prompt.SystemPromptFunc[AgentDepsT] | None = None, /, *, dynamic: bool = False, ) -> ( Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] | _system_prompt.SystemPromptFunc[AgentDepsT] ): """Decorator to register a system prompt function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used either bare (`agent.system_prompt`) or as a function call (`agent.system_prompt(...)`), see the examples below. Overloads for every possible signature of `system_prompt` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. Args: func: The function to decorate dynamic: If True, the system prompt will be reevaluated even when `messages_history` is provided, see [`SystemPromptPart.dynamic_ref`][pydantic_ai.messages.SystemPromptPart.dynamic_ref] Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.system_prompt def simple_system_prompt() -> str: return 'foobar' @agent.system_prompt(dynamic=True) async def async_system_prompt(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` """ if func is None: def decorator( func_: _system_prompt.SystemPromptFunc[AgentDepsT], ) -> _system_prompt.SystemPromptFunc[AgentDepsT]: runner = _system_prompt.SystemPromptRunner[AgentDepsT](func_, dynamic=dynamic) self._system_prompt_functions.append(runner) if dynamic: # pragma: lax no cover self._system_prompt_dynamic_functions[func_.__qualname__] = runner return func_ return decorator else: assert not dynamic, "dynamic can't be True in this case" self._system_prompt_functions.append(_system_prompt.SystemPromptRunner[AgentDepsT](func, dynamic=dynamic)) return func @overload def output_validator( self, func: Callable[[RunContext[AgentDepsT], OutputDataT], OutputDataT], / ) -> Callable[[RunContext[AgentDepsT], OutputDataT], OutputDataT]: ... @overload def output_validator( self, func: Callable[[RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT]], / ) -> Callable[[RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT]]: ... @overload def output_validator( self, func: Callable[[OutputDataT], OutputDataT], / ) -> Callable[[OutputDataT], OutputDataT]: ... @overload def output_validator( self, func: Callable[[OutputDataT], Awaitable[OutputDataT]], / ) -> Callable[[OutputDataT], Awaitable[OutputDataT]]: ... def output_validator( self, func: _output.OutputValidatorFunc[AgentDepsT, OutputDataT], / ) -> _output.OutputValidatorFunc[AgentDepsT, OutputDataT]: """Decorator to register an output validator function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. Overloads for every possible signature of `output_validator` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. 
Example: ```python from pydantic_ai import Agent, ModelRetry, RunContext agent = Agent('test', deps_type=str) @agent.output_validator def output_validator_simple(data: str) -> str: if 'wrong' in data: raise ModelRetry('wrong response') return data @agent.output_validator async def output_validator_deps(ctx: RunContext[str], data: str) -> str: if ctx.deps in data: raise ModelRetry('wrong response') return data result = agent.run_sync('foobar', deps='spam') print(result.output) #> success (no tool calls) ``` """ self._output_validators.append(_output.OutputValidator[AgentDepsT, Any](func)) return func @overload def tool(self, func: ToolFuncContext[AgentDepsT, ToolParams], /) -> ToolFuncContext[AgentDepsT, ToolParams]: ... @overload def tool( self, /, *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, requires_approval: bool = False, ) -> Callable[[ToolFuncContext[AgentDepsT, ToolParams]], ToolFuncContext[AgentDepsT, ToolParams]]: ... def tool( self, func: ToolFuncContext[AgentDepsT, ToolParams] | None = None, /, *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, requires_approval: bool = False, ) -> Any: """Decorator to register a tool function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=int) @agent.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @agent.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. 
requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. """ def tool_decorator( func_: ToolFuncContext[AgentDepsT, ToolParams], ) -> ToolFuncContext[AgentDepsT, ToolParams]: # noinspection PyTypeChecker self._function_toolset.add_function( func_, True, name, retries, prepare, docstring_format, require_parameter_descriptions, schema_generator, strict, requires_approval, ) return func_ return tool_decorator if func is None else tool_decorator(func) @overload def tool_plain(self, func: ToolFuncPlain[ToolParams], /) -> ToolFuncPlain[ToolParams]: ... @overload def tool_plain( self, /, *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, requires_approval: bool = False, ) -> Callable[[ToolFuncPlain[ToolParams]], ToolFuncPlain[ToolParams]]: ... def tool_plain( self, func: ToolFuncPlain[ToolParams] | None = None, /, *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, ) -> Any: """Decorator to register a tool function which DOES NOT take `RunContext` as an argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test') @agent.tool def foobar(ctx: RunContext[int]) -> int: return 123 @agent.tool(retries=2) async def spam(ctx: RunContext[str]) -> float: return 3.14 result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":123,"spam":3.14} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. 
requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. """ def tool_decorator(func_: ToolFuncPlain[ToolParams]) -> ToolFuncPlain[ToolParams]: # noinspection PyTypeChecker self._function_toolset.add_function( func_, False, name, retries, prepare, docstring_format, require_parameter_descriptions, schema_generator, strict, sequential, requires_approval, ) return func_ return tool_decorator if func is None else tool_decorator(func) @overload def toolset(self, func: ToolsetFunc[AgentDepsT], /) -> ToolsetFunc[AgentDepsT]: ... @overload def toolset( self, /, *, per_run_step: bool = True, ) -> Callable[[ToolsetFunc[AgentDepsT]], ToolsetFunc[AgentDepsT]]: ... def toolset( self, func: ToolsetFunc[AgentDepsT] | None = None, /, *, per_run_step: bool = True, ) -> Any: """Decorator to register a toolset function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.toolset`). Example: ```python from pydantic_ai import Agent, RunContext from pydantic_ai.toolsets import AbstractToolset, FunctionToolset agent = Agent('test', deps_type=str) @agent.toolset async def simple_toolset(ctx: RunContext[str]) -> AbstractToolset[str]: return FunctionToolset() ``` Args: func: The toolset function to register. per_run_step: Whether to re-evaluate the toolset for each run step. Defaults to True. """ def toolset_decorator(func_: ToolsetFunc[AgentDepsT]) -> ToolsetFunc[AgentDepsT]: self._dynamic_toolsets.append(DynamicToolset(func_, per_run_step=per_run_step)) return func_ return toolset_decorator if func is None else toolset_decorator(func) def _get_model(self, model: models.Model | models.KnownModelName | str | None) -> models.Model: """Create a model configured for this agent. Args: model: model to use for this run, required if `model` was not set when creating the agent. Returns: The model used """ model_: models.Model if some_model := self._override_model.get(): # we don't want `override()` to cover up errors from the model not being defined, hence this check if model is None and self.model is None: raise exceptions.UserError( '`model` must either be set on the agent or included when calling it. ' '(Even when `override(model=...)` is customizing the model that will actually be called)' ) model_ = some_model.value elif model is not None: model_ = models.infer_model(model) elif self.model is not None: # noinspection PyTypeChecker model_ = self.model = models.infer_model(self.model) else: raise exceptions.UserError('`model` must either be set on the agent or included when calling it.') instrument = self.instrument if instrument is None: instrument = self._instrument_default return instrument_model(model_, instrument) def _get_deps(self: Agent[T, OutputDataT], deps: T) -> T: """Get deps for a run. If we've overridden deps via `_override_deps`, use that, otherwise use the deps passed to the call. We could do runtime type checking of deps against `self._deps_type`, but that's a slippery slope. """ if some_deps := self._override_deps.get(): return some_deps.value else: return deps def _get_toolset( self, output_toolset: AbstractToolset[AgentDepsT] | None | _utils.Unset = _utils.UNSET, additional_toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AbstractToolset[AgentDepsT]: """Get the complete toolset. 
Args: output_toolset: The output toolset to use instead of the one built at agent construction time. additional_toolsets: Additional toolsets to add, unless toolsets have been overridden. """ toolsets = self.toolsets # Don't add additional toolsets if the toolsets have been overridden if additional_toolsets and self._override_toolsets.get() is None: toolsets = [*toolsets, *additional_toolsets] toolset = CombinedToolset(toolsets) # Copy the dynamic toolsets to ensure each run has its own instances def copy_dynamic_toolsets(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: if isinstance(toolset, DynamicToolset): return dataclasses.replace(toolset) else: return toolset toolset = toolset.visit_and_replace(copy_dynamic_toolsets) if self._prepare_tools: toolset = PreparedToolset(toolset, self._prepare_tools) output_toolset = output_toolset if _utils.is_set(output_toolset) else self._output_toolset if output_toolset is not None: if self._prepare_output_tools: output_toolset = PreparedToolset(output_toolset, self._prepare_output_tools) toolset = CombinedToolset([output_toolset, toolset]) return toolset @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: """All toolsets registered on the agent, including a function toolset holding tools that were registered on the agent directly. Output tools are not included. """ toolsets: list[AbstractToolset[AgentDepsT]] = [] if some_tools := self._override_tools.get(): function_toolset = _AgentFunctionToolset( some_tools.value, max_retries=self._max_tool_retries, output_schema=self._output_schema ) else: function_toolset = self._function_toolset toolsets.append(function_toolset) if some_user_toolsets := self._override_toolsets.get(): user_toolsets = some_user_toolsets.value else: user_toolsets = [*self._user_toolsets, *self._dynamic_toolsets] toolsets.extend(user_toolsets) return toolsets def _prepare_output_schema( self, output_type: OutputSpec[RunOutputDataT] | None, model_profile: ModelProfile ) -> _output.OutputSchema[RunOutputDataT]: if output_type is not None: if self._output_validators: raise exceptions.UserError('Cannot set a custom run `output_type` when the agent has output validators') schema = _output.OutputSchema[RunOutputDataT].build( output_type, default_mode=model_profile.default_structured_output_mode ) else: schema = self._output_schema.with_default_mode(model_profile.default_structured_output_mode) schema.raise_if_unsupported(model_profile) return schema # pyright: ignore[reportReturnType] async def __aenter__(self) -> Self: """Enter the agent context. This will start all [`MCPServerStdio`s][pydantic_ai.mcp.MCPServerStdio] registered as `toolsets` so they are ready to be used. This is a no-op if the agent has already been entered. """ async with self._enter_lock: if self._entered_count == 0: async with AsyncExitStack() as exit_stack: toolset = self._get_toolset() await exit_stack.enter_async_context(toolset) self._exit_stack = exit_stack.pop_all() self._entered_count += 1 return self async def __aexit__(self, *args: Any) -> bool | None: async with self._enter_lock: self._entered_count -= 1 if self._entered_count == 0 and self._exit_stack is not None: await self._exit_stack.aclose() self._exit_stack = None def set_mcp_sampling_model(self, model: models.Model | models.KnownModelName | str | None = None) -> None: """Set the sampling model on all MCP servers registered with the agent. If no sampling model is provided, the agent's model will be used. 
""" try: sampling_model = models.infer_model(model) if model else self._get_model(None) except exceptions.UserError as e: raise exceptions.UserError('No sampling model provided and no model set on the agent.') from e from ..mcp import MCPServer def _set_sampling_model(toolset: AbstractToolset[AgentDepsT]) -> None: if isinstance(toolset, MCPServer): toolset.sampling_model = sampling_model self._get_toolset().apply(_set_sampling_model) @asynccontextmanager @deprecated( '`run_mcp_servers` is deprecated, use `async with agent:` instead. If you need to set a sampling model on all MCP servers, use `agent.set_mcp_sampling_model()`.' ) async def run_mcp_servers( self, model: models.Model | models.KnownModelName | str | None = None ) -> AsyncIterator[None]: """Run [`MCPServerStdio`s][pydantic_ai.mcp.MCPServerStdio] so they can be used by the agent. Deprecated: use [`async with agent`][pydantic_ai.agent.Agent.__aenter__] instead. If you need to set a sampling model on all MCP servers, use [`agent.set_mcp_sampling_model()`][pydantic_ai.agent.Agent.set_mcp_sampling_model]. Returns: a context manager to start and shutdown the servers. """ try: self.set_mcp_sampling_model(model) except exceptions.UserError: if model is not None: raise async with self: yield ```` #### __init__ ```python __init__( model: Model | KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: ( str | SystemPromptFunc[AgentDepsT] | Sequence[str | SystemPromptFunc[AgentDepsT]] | None ) = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, prepare_output_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, toolsets: ( Sequence[ AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT] ] | None ) = None, defer_model_check: bool = False, end_strategy: EndStrategy = "early", instrument: ( InstrumentationSettings | bool | None ) = None, history_processors: ( Sequence[HistoryProcessor[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> None ``` ```python __init__( model: Model | KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: ( str | SystemPromptFunc[AgentDepsT] | Sequence[str | SystemPromptFunc[AgentDepsT]] | None ) = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] 
] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, prepare_output_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, mcp_servers: Sequence[MCPServer] = (), defer_model_check: bool = False, end_strategy: EndStrategy = "early", instrument: ( InstrumentationSettings | bool | None ) = None, history_processors: ( Sequence[HistoryProcessor[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> None ``` ```python __init__( model: Model | KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: ( str | SystemPromptFunc[AgentDepsT] | Sequence[str | SystemPromptFunc[AgentDepsT]] | None ) = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, prepare_output_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, toolsets: ( Sequence[ AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT] ] | None ) = None, defer_model_check: bool = False, end_strategy: EndStrategy = "early", instrument: ( InstrumentationSettings | bool | None ) = None, history_processors: ( Sequence[HistoryProcessor[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Any ) ``` Create an agent. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model` | `Model | KnownModelName | str | None` | The default model to use for this agent, if not provided, you must provide the model when calling it. We allow str here since the actual list of allowed models changes frequently. | `None` | | `output_type` | `OutputSpec[OutputDataT]` | The type of the output data, used to validate the data returned by the model, defaults to str. | `str` | | `instructions` | `str | SystemPromptFunc[AgentDepsT] | Sequence[str | SystemPromptFunc[AgentDepsT]] | None` | Instructions to use for this agent, you can also register instructions via a function with instructions. | `None` | | `system_prompt` | `str | Sequence[str]` | Static system prompts to use for this agent, you can also register system prompts via a function with system_prompt. | `()` | | `deps_type` | `type[AgentDepsT]` | The type used for dependency injection, this parameter exists solely to allow you to fully parameterize the agent, and therefore get the best out of static type checking. If you're not using deps, but want type checking to pass, you can set deps=None to satisfy Pyright or add a type hint : Agent\[None, \]. | `NoneType` | | `name` | `str | None` | The name of the agent, used for logging. If None, we try to infer the agent name from the call frame when the agent is first run. | `None` | | `model_settings` | `ModelSettings | None` | Optional model request settings to use for this agent's runs, by default. | `None` | | `retries` | `int` | The default number of retries to allow before raising an error. | `1` | | `output_retries` | `int | None` | The maximum number of retries to allow for output validation, defaults to retries. 
| `None` | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]]` | Tools to register with the agent, you can also register tools via the decorators @agent.tool and @agent.tool_plain. | `()` | | `builtin_tools` | `Sequence[AbstractBuiltinTool]` | The builtin tools that the agent will use. This depends on the model, as some models may not support certain tools. If the model doesn't support the builtin tools, an error will be raised. | `()` | | `prepare_tools` | `ToolsPrepareFunc[AgentDepsT] | None` | Custom function to prepare the tool definition of all tools for each step, except output tools. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. See ToolsPrepareFunc | `None` | | `prepare_output_tools` | `ToolsPrepareFunc[AgentDepsT] | None` | Custom function to prepare the tool definition of all output tools for each step. This is useful if you want to customize the definition of multiple output tools or you want to register a subset of output tools for a given step. See ToolsPrepareFunc | `None` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT]] | None` | Toolsets to register with the agent, including MCP servers and functions which take a run context and return a toolset. See ToolsetFunc for more information. | `None` | | `defer_model_check` | `bool` | by default, if you provide a named model, it's evaluated to create a Model instance immediately, which checks for the necessary environment variables. Set this to false to defer the evaluation until the first run. Useful if you want to override the model for testing. | `False` | | `end_strategy` | `EndStrategy` | Strategy for handling tool calls that are requested alongside a final result. See EndStrategy for more information. | `'early'` | | `instrument` | `InstrumentationSettings | bool | None` | Set to True to automatically instrument with OpenTelemetry, which will use Logfire if it's configured. Set to an instance of InstrumentationSettings to customize. If this isn't set, then the last value set by Agent.instrument_all() will be used, which defaults to False. See the Debugging and Monitoring guide for more info. | `None` | | `history_processors` | `Sequence[HistoryProcessor[AgentDepsT]] | None` | Optional list of callables to process the message history before sending it to the model. Each processor takes a list of messages and returns a modified list of messages. Processors can be sync or async and are applied in sequence. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional handler for events from the model's streaming response and the agent's execution of tools. 
| `None` | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python def __init__( self, model: models.Model | models.KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: str | _system_prompt.SystemPromptFunc[AgentDepsT] | Sequence[str | _system_prompt.SystemPromptFunc[AgentDepsT]] | None = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, toolsets: Sequence[AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Any, ): """Create an agent. Args: model: The default model to use for this agent, if not provided, you must provide the model when calling it. We allow `str` here since the actual list of allowed models changes frequently. output_type: The type of the output data, used to validate the data returned by the model, defaults to `str`. instructions: Instructions to use for this agent, you can also register instructions via a function with [`instructions`][pydantic_ai.Agent.instructions]. system_prompt: Static system prompts to use for this agent, you can also register system prompts via a function with [`system_prompt`][pydantic_ai.Agent.system_prompt]. deps_type: The type used for dependency injection, this parameter exists solely to allow you to fully parameterize the agent, and therefore get the best out of static type checking. If you're not using deps, but want type checking to pass, you can set `deps=None` to satisfy Pyright or add a type hint `: Agent[None, ]`. name: The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. model_settings: Optional model request settings to use for this agent's runs, by default. retries: The default number of retries to allow before raising an error. output_retries: The maximum number of retries to allow for output validation, defaults to `retries`. tools: Tools to register with the agent, you can also register tools via the decorators [`@agent.tool`][pydantic_ai.Agent.tool] and [`@agent.tool_plain`][pydantic_ai.Agent.tool_plain]. builtin_tools: The builtin tools that the agent will use. This depends on the model, as some models may not support certain tools. If the model doesn't support the builtin tools, an error will be raised. prepare_tools: Custom function to prepare the tool definition of all tools for each step, except output tools. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. See [`ToolsPrepareFunc`][pydantic_ai.tools.ToolsPrepareFunc] prepare_output_tools: Custom function to prepare the tool definition of all output tools for each step. This is useful if you want to customize the definition of multiple output tools or you want to register a subset of output tools for a given step. 
See [`ToolsPrepareFunc`][pydantic_ai.tools.ToolsPrepareFunc] toolsets: Toolsets to register with the agent, including MCP servers and functions which take a run context and return a toolset. See [`ToolsetFunc`][pydantic_ai.toolsets.ToolsetFunc] for more information. defer_model_check: by default, if you provide a [named][pydantic_ai.models.KnownModelName] model, it's evaluated to create a [`Model`][pydantic_ai.models.Model] instance immediately, which checks for the necessary environment variables. Set this to `false` to defer the evaluation until the first run. Useful if you want to [override the model][pydantic_ai.Agent.override] for testing. end_strategy: Strategy for handling tool calls that are requested alongside a final result. See [`EndStrategy`][pydantic_ai.agent.EndStrategy] for more information. instrument: Set to True to automatically instrument with OpenTelemetry, which will use Logfire if it's configured. Set to an instance of [`InstrumentationSettings`][pydantic_ai.agent.InstrumentationSettings] to customize. If this isn't set, then the last value set by [`Agent.instrument_all()`][pydantic_ai.Agent.instrument_all] will be used, which defaults to False. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. history_processors: Optional list of callables to process the message history before sending it to the model. Each processor takes a list of messages and returns a modified list of messages. Processors can be sync or async and are applied in sequence. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools. """ if model is None or defer_model_check: self._model = model else: self._model = models.infer_model(model) self._name = name self.end_strategy = end_strategy self.model_settings = model_settings self._output_type = output_type self.instrument = instrument self._deps_type = deps_type if mcp_servers := _deprecated_kwargs.pop('mcp_servers', None): if toolsets is not None: # pragma: no cover raise TypeError('`mcp_servers` and `toolsets` cannot be set at the same time.') warnings.warn('`mcp_servers` is deprecated, use `toolsets` instead', DeprecationWarning) toolsets = mcp_servers _utils.validate_empty_kwargs(_deprecated_kwargs) default_output_mode = ( self.model.profile.default_structured_output_mode if isinstance(self.model, models.Model) else None ) self._output_schema = _output.OutputSchema[OutputDataT].build(output_type, default_mode=default_output_mode) self._output_validators = [] self._instructions = '' self._instructions_functions = [] if isinstance(instructions, str | Callable): instructions = [instructions] for instruction in instructions or []: if isinstance(instruction, str): self._instructions += instruction + '\n' else: self._instructions_functions.append(_system_prompt.SystemPromptRunner(instruction)) self._instructions = self._instructions.strip() or None self._system_prompts = (system_prompt,) if isinstance(system_prompt, str) else tuple(system_prompt) self._system_prompt_functions = [] self._system_prompt_dynamic_functions = {} self._max_result_retries = output_retries if output_retries is not None else retries self._max_tool_retries = retries self._builtin_tools = builtin_tools self._prepare_tools = prepare_tools self._prepare_output_tools = prepare_output_tools self._output_toolset = self._output_schema.toolset if self._output_toolset: self._output_toolset.max_retries = self._max_result_retries self._function_toolset = _AgentFunctionToolset( tools, 
max_retries=self._max_tool_retries, output_schema=self._output_schema ) self._dynamic_toolsets = [ DynamicToolset[AgentDepsT](toolset_func=toolset) for toolset in toolsets or [] if not isinstance(toolset, AbstractToolset) ] self._user_toolsets = [toolset for toolset in toolsets or [] if isinstance(toolset, AbstractToolset)] self.history_processors = history_processors or [] self._event_stream_handler = event_stream_handler self._override_deps: ContextVar[_utils.Option[AgentDepsT]] = ContextVar('_override_deps', default=None) self._override_model: ContextVar[_utils.Option[models.Model]] = ContextVar('_override_model', default=None) self._override_toolsets: ContextVar[_utils.Option[Sequence[AbstractToolset[AgentDepsT]]]] = ContextVar( '_override_toolsets', default=None ) self._override_tools: ContextVar[ _utils.Option[Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]]] ] = ContextVar('_override_tools', default=None) self._enter_lock = Lock() self._entered_count = 0 self._exit_stack = None ``` #### end_strategy ```python end_strategy: EndStrategy = end_strategy ``` Strategy for handling tool calls when a final result is found. #### model_settings ```python model_settings: ModelSettings | None = model_settings ``` Optional model request settings to use for this agent's runs, by default. Note, if `model_settings` is provided by `run`, `run_sync`, or `run_stream`, those settings will be merged with this value, with the runtime argument taking priority. #### instrument ```python instrument: InstrumentationSettings | bool | None = ( instrument ) ``` Options to automatically instrument with OpenTelemetry. #### instrument_all ```python instrument_all( instrument: InstrumentationSettings | bool = True, ) -> None ``` Set the instrumentation options for all agents where `instrument` is not set. Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python @staticmethod def instrument_all(instrument: InstrumentationSettings | bool = True) -> None: """Set the instrumentation options for all agents where `instrument` is not set.""" Agent._instrument_default = instrument ``` #### model ```python model: Model | KnownModelName | str | None ``` The default model configured for this agent. #### name ```python name: str | None ``` The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. #### deps_type ```python deps_type: type ``` The type of dependencies used by the agent. #### output_type ```python output_type: OutputSpec[OutputDataT] ``` The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`. #### event_stream_handler ```python event_stream_handler: EventStreamHandler[AgentDepsT] | None ``` Optional handler for events from the model's streaming response and the agent's execution of tools.
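For illustration, here is a minimal sketch of the merge behaviour described above for `model_settings`: agent-level settings act as per-run defaults, and settings passed at run time take priority. The model name and the specific settings values are illustrative assumptions, and the printed output depends on the model.

```python
from pydantic_ai import Agent

# Agent-level settings act as defaults for every run of this agent.
agent = Agent(
    'openai:gpt-4o',
    model_settings={'temperature': 0.2, 'max_tokens': 512},
)

# Run-time settings are merged with the agent-level defaults, with the
# run-time values taking priority: `temperature` becomes 0.9 for this run,
# while `max_tokens` stays at 512.
result = agent.run_sync(
    'Explain what an agent framework is in one sentence.',
    model_settings={'temperature': 0.9},
)
print(result.output)
```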
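Similarly, the `override()` context manager documented further below is particularly useful in tests. The sketch below is an assumption-laden example rather than library documentation: it assumes a pytest-style test, uses `TestModel` from `pydantic_ai.models.test` to avoid real model calls, and only asserts on the shape of the canned response.

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

# `defer_model_check=True` (documented in `__init__` above) delays model
# resolution until the first run, so no API key is needed at import time.
agent = Agent('openai:gpt-4o', instructions='Be concise.', defer_model_check=True)


def test_agent_with_overridden_model():
    # Temporarily swap in TestModel so the test never calls a real LLM.
    with agent.override(model=TestModel()):
        result = agent.run_sync('What is the capital of Italy?')
    # TestModel returns a canned string response; assert on its type only.
    assert isinstance(result.output, str)
```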
#### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. 
``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python @asynccontextmanager async def iter( # noqa: C901 self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) model_used = self._get_model(model) del model deps = self._get_deps(deps) new_message_index = len(message_history) if message_history else 0 output_schema = self._prepare_output_schema(output_type, model_used.profile) output_type_ = output_type or self.output_type # We consider it a user error if a user tries to restrict the result type while having an output validator that # may change the result type from the restricted type to something else. Therefore, we consider the following # typecast reasonable, even though it is possible to violate it with otherwise-type-checked code. 
output_validators = cast(list[_output.OutputValidator[AgentDepsT, RunOutputDataT]], self._output_validators) output_toolset = self._output_toolset if output_schema != self._output_schema or output_validators: output_toolset = cast(OutputToolset[AgentDepsT], output_schema.toolset) if output_toolset: output_toolset.max_retries = self._max_result_retries output_toolset.output_validators = output_validators toolset = self._get_toolset(output_toolset=output_toolset, additional_toolsets=toolsets) tool_manager = ToolManager[AgentDepsT](toolset) # Build the graph graph: Graph[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any], FinalResult[Any]] = ( _agent_graph.build_agent_graph(self.name, self._deps_type, output_type_) ) # Build the initial state usage = usage or _usage.RunUsage() state = _agent_graph.GraphAgentState( message_history=message_history[:] if message_history else [], usage=usage, retries=0, run_step=0, ) # Merge model settings in order of precedence: run > agent > model merged_settings = merge_model_settings(model_used.settings, self.model_settings) model_settings = merge_model_settings(merged_settings, model_settings) usage_limits = usage_limits or _usage.UsageLimits() async def get_instructions(run_context: RunContext[AgentDepsT]) -> str | None: parts = [ self._instructions, *[await func.run(run_context) for func in self._instructions_functions], ] model_profile = model_used.profile if isinstance(output_schema, _output.PromptedOutputSchema): instructions = output_schema.instructions(model_profile.prompted_output_template) parts.append(instructions) parts = [p for p in parts if p] if not parts: return None return '\n\n'.join(parts).strip() if isinstance(model_used, InstrumentedModel): instrumentation_settings = model_used.instrumentation_settings tracer = model_used.instrumentation_settings.tracer else: instrumentation_settings = None tracer = NoOpTracer() tool_call_results: dict[str, DeferredToolResult] | None = None if deferred_tool_results is not None: tool_call_results = {} for tool_call_id, approval in deferred_tool_results.approvals.items(): if approval is True: approval = ToolApproved() elif approval is False: approval = ToolDenied() tool_call_results[tool_call_id] = approval if calls := deferred_tool_results.calls: call_result_types = _utils.get_union_args(DeferredToolCallResult) for tool_call_id, result in calls.items(): if not isinstance(result, call_result_types): result = _messages.ToolReturn(result) tool_call_results[tool_call_id] = result graph_deps = _agent_graph.GraphAgentDeps[AgentDepsT, RunOutputDataT]( user_deps=deps, prompt=user_prompt, new_message_index=new_message_index, model=model_used, model_settings=model_settings, usage_limits=usage_limits, max_result_retries=self._max_result_retries, end_strategy=self.end_strategy, output_schema=output_schema, output_validators=output_validators, history_processors=self.history_processors, builtin_tools=list(self._builtin_tools), tool_manager=tool_manager, tool_call_results=tool_call_results, tracer=tracer, get_instructions=get_instructions, instrumentation_settings=instrumentation_settings, ) start_node = _agent_graph.UserPromptNode[AgentDepsT]( user_prompt=user_prompt, instructions=self._instructions, instructions_functions=self._instructions_functions, system_prompts=self._system_prompts, system_prompt_functions=self._system_prompt_functions, system_prompt_dynamic_functions=self._system_prompt_dynamic_functions, ) agent_name = self.name or 'agent' run_span = tracer.start_span( 'agent run', 
attributes={ 'model_name': model_used.model_name if model_used else 'no-model', 'agent_name': agent_name, 'logfire.msg': f'{agent_name} run', }, ) try: async with toolset: async with graph.iter( start_node, state=state, deps=graph_deps, span=use_span(run_span) if run_span.is_recording() else None, infer_name=False, ) as graph_run: agent_run = AgentRun(graph_run) yield agent_run if (final_result := agent_run.result) is not None and run_span.is_recording(): if instrumentation_settings and instrumentation_settings.include_content: run_span.set_attribute( 'final_result', ( final_result.output if isinstance(final_result.output, str) else json.dumps(InstrumentedModel.serialize_any(final_result.output)) ), ) finally: try: if instrumentation_settings and run_span.is_recording(): run_span.set_attributes(self._run_span_end_attributes(state, usage, instrumentation_settings)) finally: run_span.end() ```` #### override ```python override( *, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../../testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `deps` | `AgentDepsT | Unset` | The dependencies to use instead of the dependencies passed to the agent run. | `UNSET` | | `model` | `Model | KnownModelName | str | Unset` | The model to use instead of the model passed to the agent run. | `UNSET` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | Unset` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | `UNSET` | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | Unset` | The tools to use instead of the tools registered with the agent. | `UNSET` | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python @contextmanager def override( self, *, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. 
""" if _utils.is_set(deps): deps_token = self._override_deps.set(_utils.Some(deps)) else: deps_token = None if _utils.is_set(model): model_token = self._override_model.set(_utils.Some(models.infer_model(model))) else: model_token = None if _utils.is_set(toolsets): toolsets_token = self._override_toolsets.set(_utils.Some(toolsets)) else: toolsets_token = None if _utils.is_set(tools): tools_token = self._override_tools.set(_utils.Some(tools)) else: tools_token = None try: yield finally: if deps_token is not None: self._override_deps.reset(deps_token) if model_token is not None: self._override_model.reset(model_token) if toolsets_token is not None: self._override_toolsets.reset(toolsets_token) if tools_token is not None: self._override_tools.reset(tools_token) ``` #### instructions ```python instructions( func: Callable[[RunContext[AgentDepsT]], str], ) -> Callable[[RunContext[AgentDepsT]], str] ``` ```python instructions( func: Callable[ [RunContext[AgentDepsT]], Awaitable[str] ], ) -> Callable[[RunContext[AgentDepsT]], Awaitable[str]] ``` ```python instructions(func: Callable[[], str]) -> Callable[[], str] ``` ```python instructions( func: Callable[[], Awaitable[str]], ) -> Callable[[], Awaitable[str]] ``` ```python instructions() -> Callable[ [SystemPromptFunc[AgentDepsT]], SystemPromptFunc[AgentDepsT], ] ``` ```python instructions( func: SystemPromptFunc[AgentDepsT] | None = None, ) -> ( Callable[ [SystemPromptFunc[AgentDepsT]], SystemPromptFunc[AgentDepsT], ] | SystemPromptFunc[AgentDepsT] ) ``` Decorator to register an instructions function. Optionally takes RunContext as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.instructions`). Overloads for every possible signature of `instructions` are included so the decorator doesn't obscure the type of the function. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.instructions def simple_instructions() -> str: return 'foobar' @agent.instructions async def async_instructions(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def instructions( self, func: _system_prompt.SystemPromptFunc[AgentDepsT] | None = None, /, ) -> ( Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] | _system_prompt.SystemPromptFunc[AgentDepsT] ): """Decorator to register an instructions function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.instructions`). Overloads for every possible signature of `instructions` are included so the decorator doesn't obscure the type of the function. 
Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.instructions def simple_instructions() -> str: return 'foobar' @agent.instructions async def async_instructions(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` """ if func is None: def decorator( func_: _system_prompt.SystemPromptFunc[AgentDepsT], ) -> _system_prompt.SystemPromptFunc[AgentDepsT]: self._instructions_functions.append(_system_prompt.SystemPromptRunner(func_)) return func_ return decorator else: self._instructions_functions.append(_system_prompt.SystemPromptRunner(func)) return func ```` #### system_prompt ```python system_prompt( func: Callable[[RunContext[AgentDepsT]], str], ) -> Callable[[RunContext[AgentDepsT]], str] ``` ```python system_prompt( func: Callable[ [RunContext[AgentDepsT]], Awaitable[str] ], ) -> Callable[[RunContext[AgentDepsT]], Awaitable[str]] ``` ```python system_prompt(func: Callable[[], str]) -> Callable[[], str] ``` ```python system_prompt( func: Callable[[], Awaitable[str]], ) -> Callable[[], Awaitable[str]] ``` ```python system_prompt(*, dynamic: bool = False) -> Callable[ [SystemPromptFunc[AgentDepsT]], SystemPromptFunc[AgentDepsT], ] ``` ```python system_prompt( func: SystemPromptFunc[AgentDepsT] | None = None, /, *, dynamic: bool = False, ) -> ( Callable[ [SystemPromptFunc[AgentDepsT]], SystemPromptFunc[AgentDepsT], ] | SystemPromptFunc[AgentDepsT] ) ``` Decorator to register a system prompt function. Optionally takes RunContext as its only argument. Can decorate a sync or async functions. The decorator can be used either bare (`agent.system_prompt`) or as a function call (`agent.system_prompt(...)`), see the examples below. Overloads for every possible signature of `system_prompt` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `func` | `SystemPromptFunc[AgentDepsT] | None` | The function to decorate | `None` | | `dynamic` | `bool` | If True, the system prompt will be reevaluated even when messages_history is provided, see SystemPromptPart.dynamic_ref | `False` | Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.system_prompt def simple_system_prompt() -> str: return 'foobar' @agent.system_prompt(dynamic=True) async def async_system_prompt(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def system_prompt( self, func: _system_prompt.SystemPromptFunc[AgentDepsT] | None = None, /, *, dynamic: bool = False, ) -> ( Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] | _system_prompt.SystemPromptFunc[AgentDepsT] ): """Decorator to register a system prompt function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used either bare (`agent.system_prompt`) or as a function call (`agent.system_prompt(...)`), see the examples below. Overloads for every possible signature of `system_prompt` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. 
Args: func: The function to decorate dynamic: If True, the system prompt will be reevaluated even when `messages_history` is provided, see [`SystemPromptPart.dynamic_ref`][pydantic_ai.messages.SystemPromptPart.dynamic_ref] Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.system_prompt def simple_system_prompt() -> str: return 'foobar' @agent.system_prompt(dynamic=True) async def async_system_prompt(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` """ if func is None: def decorator( func_: _system_prompt.SystemPromptFunc[AgentDepsT], ) -> _system_prompt.SystemPromptFunc[AgentDepsT]: runner = _system_prompt.SystemPromptRunner[AgentDepsT](func_, dynamic=dynamic) self._system_prompt_functions.append(runner) if dynamic: # pragma: lax no cover self._system_prompt_dynamic_functions[func_.__qualname__] = runner return func_ return decorator else: assert not dynamic, "dynamic can't be True in this case" self._system_prompt_functions.append(_system_prompt.SystemPromptRunner[AgentDepsT](func, dynamic=dynamic)) return func ```` #### output_validator ```python output_validator( func: Callable[ [RunContext[AgentDepsT], OutputDataT], OutputDataT ], ) -> Callable[ [RunContext[AgentDepsT], OutputDataT], OutputDataT ] ``` ```python output_validator( func: Callable[ [RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT], ], ) -> Callable[ [RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT], ] ``` ```python output_validator( func: Callable[[OutputDataT], OutputDataT], ) -> Callable[[OutputDataT], OutputDataT] ``` ```python output_validator( func: Callable[[OutputDataT], Awaitable[OutputDataT]], ) -> Callable[[OutputDataT], Awaitable[OutputDataT]] ``` ```python output_validator( func: OutputValidatorFunc[AgentDepsT, OutputDataT], ) -> OutputValidatorFunc[AgentDepsT, OutputDataT] ``` Decorator to register an output validator function. Optionally takes RunContext as its first argument. Can decorate a sync or async functions. Overloads for every possible signature of `output_validator` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. Example: ```python from pydantic_ai import Agent, ModelRetry, RunContext agent = Agent('test', deps_type=str) @agent.output_validator def output_validator_simple(data: str) -> str: if 'wrong' in data: raise ModelRetry('wrong response') return data @agent.output_validator async def output_validator_deps(ctx: RunContext[str], data: str) -> str: if ctx.deps in data: raise ModelRetry('wrong response') return data result = agent.run_sync('foobar', deps='spam') print(result.output) #> success (no tool calls) ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def output_validator( self, func: _output.OutputValidatorFunc[AgentDepsT, OutputDataT], / ) -> _output.OutputValidatorFunc[AgentDepsT, OutputDataT]: """Decorator to register an output validator function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. Overloads for every possible signature of `output_validator` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. 
Example: ```python from pydantic_ai import Agent, ModelRetry, RunContext agent = Agent('test', deps_type=str) @agent.output_validator def output_validator_simple(data: str) -> str: if 'wrong' in data: raise ModelRetry('wrong response') return data @agent.output_validator async def output_validator_deps(ctx: RunContext[str], data: str) -> str: if ctx.deps in data: raise ModelRetry('wrong response') return data result = agent.run_sync('foobar', deps='spam') print(result.output) #> success (no tool calls) ``` """ self._output_validators.append(_output.OutputValidator[AgentDepsT, Any](func)) return func ```` #### tool ```python tool( func: ToolFuncContext[AgentDepsT, ToolParams], ) -> ToolFuncContext[AgentDepsT, ToolParams] ``` ```python tool( *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, requires_approval: bool = False ) -> Callable[ [ToolFuncContext[AgentDepsT, ToolParams]], ToolFuncContext[AgentDepsT, ToolParams], ] ``` ```python tool( func: ( ToolFuncContext[AgentDepsT, ToolParams] | None ) = None, /, *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, requires_approval: bool = False, ) -> Any ``` Decorator to register a tool function which takes RunContext as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../../tools/#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=int) @agent.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @agent.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `func` | `ToolFuncContext[AgentDepsT, ToolParams] | None` | The tool function to register. | `None` | | `name` | `str | None` | The name of the tool, defaults to the function name. | `None` | | `retries` | `int | None` | The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. | `None` | | `prepare` | `ToolPrepareFunc[AgentDepsT] | None` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | `None` | | `docstring_format` | `DocstringFormat` | The format of the docstring, see DocstringFormat. Defaults to 'auto', such that the format is inferred from the structure of the docstring. | `'auto'` | | `require_parameter_descriptions` | `bool` | If True, raise an error if a parameter description is missing. Defaults to False. 
| `False` | | `schema_generator` | `type[GenerateJsonSchema]` | The JSON schema generator class to use for this tool. Defaults to GenerateToolJsonSchema. | `GenerateToolJsonSchema` | | `strict` | `bool | None` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. | `None` | | `requires_approval` | `bool` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. | `False` | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def tool( self, func: ToolFuncContext[AgentDepsT, ToolParams] | None = None, /, *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, requires_approval: bool = False, ) -> Any: """Decorator to register a tool function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=int) @agent.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @agent.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. 
""" def tool_decorator( func_: ToolFuncContext[AgentDepsT, ToolParams], ) -> ToolFuncContext[AgentDepsT, ToolParams]: # noinspection PyTypeChecker self._function_toolset.add_function( func_, True, name, retries, prepare, docstring_format, require_parameter_descriptions, schema_generator, strict, requires_approval, ) return func_ return tool_decorator if func is None else tool_decorator(func) ```` #### tool_plain ```python tool_plain( func: ToolFuncPlain[ToolParams], ) -> ToolFuncPlain[ToolParams] ``` ```python tool_plain( *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, requires_approval: bool = False ) -> Callable[ [ToolFuncPlain[ToolParams]], ToolFuncPlain[ToolParams] ] ``` ```python tool_plain( func: ToolFuncPlain[ToolParams] | None = None, /, *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, ) -> Any ``` Decorator to register a tool function which DOES NOT take `RunContext` as an argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../../tools/#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test') @agent.tool def foobar(ctx: RunContext[int]) -> int: return 123 @agent.tool(retries=2) async def spam(ctx: RunContext[str]) -> float: return 3.14 result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":123,"spam":3.14} ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `func` | `ToolFuncPlain[ToolParams] | None` | The tool function to register. | `None` | | `name` | `str | None` | The name of the tool, defaults to the function name. | `None` | | `retries` | `int | None` | The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. | `None` | | `prepare` | `ToolPrepareFunc[AgentDepsT] | None` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | `None` | | `docstring_format` | `DocstringFormat` | The format of the docstring, see DocstringFormat. Defaults to 'auto', such that the format is inferred from the structure of the docstring. | `'auto'` | | `require_parameter_descriptions` | `bool` | If True, raise an error if a parameter description is missing. Defaults to False. | `False` | | `schema_generator` | `type[GenerateJsonSchema]` | The JSON schema generator class to use for this tool. Defaults to GenerateToolJsonSchema. | `GenerateToolJsonSchema` | | `strict` | `bool | None` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. 
| `None` | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. | `False` | | `requires_approval` | `bool` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. | `False` | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def tool_plain( self, func: ToolFuncPlain[ToolParams] | None = None, /, *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, ) -> Any: """Decorator to register a tool function which DOES NOT take `RunContext` as an argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test') @agent.tool def foobar(ctx: RunContext[int]) -> int: return 123 @agent.tool(retries=2) async def spam(ctx: RunContext[str]) -> float: return 3.14 result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":123,"spam":3.14} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. 
""" def tool_decorator(func_: ToolFuncPlain[ToolParams]) -> ToolFuncPlain[ToolParams]: # noinspection PyTypeChecker self._function_toolset.add_function( func_, False, name, retries, prepare, docstring_format, require_parameter_descriptions, schema_generator, strict, sequential, requires_approval, ) return func_ return tool_decorator if func is None else tool_decorator(func) ```` #### toolset ```python toolset( func: ToolsetFunc[AgentDepsT], ) -> ToolsetFunc[AgentDepsT] ``` ```python toolset( *, per_run_step: bool = True ) -> Callable[ [ToolsetFunc[AgentDepsT]], ToolsetFunc[AgentDepsT] ] ``` ```python toolset( func: ToolsetFunc[AgentDepsT] | None = None, /, *, per_run_step: bool = True, ) -> Any ``` Decorator to register a toolset function which takes RunContext as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.toolset`). Example: ```python from pydantic_ai import Agent, RunContext from pydantic_ai.toolsets import AbstractToolset, FunctionToolset agent = Agent('test', deps_type=str) @agent.toolset async def simple_toolset(ctx: RunContext[str]) -> AbstractToolset[str]: return FunctionToolset() ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `func` | `ToolsetFunc[AgentDepsT] | None` | The toolset function to register. | `None` | | `per_run_step` | `bool` | Whether to re-evaluate the toolset for each run step. Defaults to True. | `True` | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def toolset( self, func: ToolsetFunc[AgentDepsT] | None = None, /, *, per_run_step: bool = True, ) -> Any: """Decorator to register a toolset function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.toolset`). Example: ```python from pydantic_ai import Agent, RunContext from pydantic_ai.toolsets import AbstractToolset, FunctionToolset agent = Agent('test', deps_type=str) @agent.toolset async def simple_toolset(ctx: RunContext[str]) -> AbstractToolset[str]: return FunctionToolset() ``` Args: func: The toolset function to register. per_run_step: Whether to re-evaluate the toolset for each run step. Defaults to True. """ def toolset_decorator(func_: ToolsetFunc[AgentDepsT]) -> ToolsetFunc[AgentDepsT]: self._dynamic_toolsets.append(DynamicToolset(func_, per_run_step=per_run_step)) return func_ return toolset_decorator if func is None else toolset_decorator(func) ```` #### toolsets ```python toolsets: Sequence[AbstractToolset[AgentDepsT]] ``` All toolsets registered on the agent, including a function toolset holding tools that were registered on the agent directly. Output tools are not included. #### __aenter__ ```python __aenter__() -> Self ``` Enter the agent context. This will start all MCPServerStdios registered as `toolsets` so they are ready to be used. This is a no-op if the agent has already been entered. Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python async def __aenter__(self) -> Self: """Enter the agent context. This will start all [`MCPServerStdio`s][pydantic_ai.mcp.MCPServerStdio] registered as `toolsets` so they are ready to be used. This is a no-op if the agent has already been entered. 
""" async with self._enter_lock: if self._entered_count == 0: async with AsyncExitStack() as exit_stack: toolset = self._get_toolset() await exit_stack.enter_async_context(toolset) self._exit_stack = exit_stack.pop_all() self._entered_count += 1 return self ``` #### set_mcp_sampling_model ```python set_mcp_sampling_model( model: Model | KnownModelName | str | None = None, ) -> None ``` Set the sampling model on all MCP servers registered with the agent. If no sampling model is provided, the agent's model will be used. Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python def set_mcp_sampling_model(self, model: models.Model | models.KnownModelName | str | None = None) -> None: """Set the sampling model on all MCP servers registered with the agent. If no sampling model is provided, the agent's model will be used. """ try: sampling_model = models.infer_model(model) if model else self._get_model(None) except exceptions.UserError as e: raise exceptions.UserError('No sampling model provided and no model set on the agent.') from e from ..mcp import MCPServer def _set_sampling_model(toolset: AbstractToolset[AgentDepsT]) -> None: if isinstance(toolset, MCPServer): toolset.sampling_model = sampling_model self._get_toolset().apply(_set_sampling_model) ``` #### run_mcp_servers ```python run_mcp_servers( model: Model | KnownModelName | str | None = None, ) -> AsyncIterator[None] ``` Deprecated `run_mcp_servers` is deprecated, use `async with agent:` instead. If you need to set a sampling model on all MCP servers, use `agent.set_mcp_sampling_model()`. Run MCPServerStdios so they can be used by the agent. Deprecated: use async with agent instead. If you need to set a sampling model on all MCP servers, use agent.set_mcp_sampling_model(). Returns: a context manager to start and shutdown the servers. Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python @asynccontextmanager @deprecated( '`run_mcp_servers` is deprecated, use `async with agent:` instead. If you need to set a sampling model on all MCP servers, use `agent.set_mcp_sampling_model()`.' ) async def run_mcp_servers( self, model: models.Model | models.KnownModelName | str | None = None ) -> AsyncIterator[None]: """Run [`MCPServerStdio`s][pydantic_ai.mcp.MCPServerStdio] so they can be used by the agent. Deprecated: use [`async with agent`][pydantic_ai.agent.Agent.__aenter__] instead. If you need to set a sampling model on all MCP servers, use [`agent.set_mcp_sampling_model()`][pydantic_ai.agent.Agent.set_mcp_sampling_model]. Returns: a context manager to start and shutdown the servers. """ try: self.set_mcp_sampling_model(model) except exceptions.UserError: if model is not None: raise async with self: yield ``` ### AbstractAgent Bases: `Generic[AgentDepsT, OutputDataT]`, `ABC` Abstract superclass for Agent, WrapperAgent, and your own custom agent implementations. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python class AbstractAgent(Generic[AgentDepsT, OutputDataT], ABC): """Abstract superclass for [`Agent`][pydantic_ai.agent.Agent], [`WrapperAgent`][pydantic_ai.agent.WrapperAgent], and your own custom agent implementations.""" @property @abstractmethod def model(self) -> models.Model | models.KnownModelName | str | None: """The default model configured for this agent.""" raise NotImplementedError @property @abstractmethod def name(self) -> str | None: """The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. 
""" raise NotImplementedError @name.setter @abstractmethod def name(self, value: str | None) -> None: """Set the name of the agent, used for logging.""" raise NotImplementedError @property @abstractmethod def deps_type(self) -> type: """The type of dependencies used by the agent.""" raise NotImplementedError @property @abstractmethod def output_type(self) -> OutputSpec[OutputDataT]: """The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`.""" raise NotImplementedError @property @abstractmethod def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: """Optional handler for events from the model's streaming response and the agent's execution of tools.""" raise NotImplementedError @property @abstractmethod def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: """All toolsets registered on the agent. Output tools are not included. """ raise NotImplementedError @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. 
output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) event_stream_handler = event_stream_handler or self.event_stream_handler async with self.iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, toolsets=toolsets, ) as agent_run: async for node in agent_run: if event_stream_handler is not None and ( self.is_model_request_node(node) or self.is_call_tools_node(node) ): async with node.stream(agent_run.ctx) as stream: await event_stream_handler(_agent_graph.build_run_context(agent_run.ctx), stream) assert agent_run.result is not None, 'The graph run did not finish properly' return agent_run.result @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... 
def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) return get_event_loop().run_until_complete( self.run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=False, toolsets=toolsets, event_stream_handler=event_stream_handler, ) ) @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[result.StreamedRunResult[AgentDepsT, OutputDataT]]: ... 
@overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[result.StreamedRunResult[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def run_stream( # noqa C901 self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AsyncIterator[result.StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async streaming mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage. As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use [`agent.run()`][pydantic_ai.agent.AbstractAgent.run] with an `event_stream_handler` or [`agent.iter()`][pydantic_ai.agent.AbstractAgent.iter] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. 
infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does _not_ receive any events after the final result is found. Returns: The result of the run. """ if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) event_stream_handler = event_stream_handler or self.event_stream_handler yielded = False async with self.iter( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=False, toolsets=toolsets, ) as agent_run: first_node = agent_run.next_node # start with the first node assert isinstance(first_node, _agent_graph.UserPromptNode) # the first node should be a user prompt node node = first_node while True: graph_ctx = agent_run.ctx if self.is_model_request_node(node): async with node.stream(graph_ctx) as stream: final_result_event = None async def stream_to_final( stream: AgentStream, ) -> AsyncIterator[_messages.ModelResponseStreamEvent]: nonlocal final_result_event async for event in stream: yield event if isinstance(event, _messages.FinalResultEvent): final_result_event = event break if event_stream_handler is not None: await event_stream_handler( _agent_graph.build_run_context(graph_ctx), stream_to_final(stream) ) else: async for _ in stream_to_final(stream): pass if final_result_event is not None: final_result = FinalResult( stream, final_result_event.tool_name, final_result_event.tool_call_id ) if yielded: raise exceptions.AgentRunError('Agent run produced final results') # pragma: no cover yielded = True messages = graph_ctx.state.message_history.copy() async def on_complete() -> None: """Called when the stream has completed. The model response will have been added to messages by now by `StreamedRunResult._marked_completed`. """ last_message = messages[-1] assert isinstance(last_message, _messages.ModelResponse) tool_calls = [ part for part in last_message.parts if isinstance(part, _messages.ToolCallPart) ] parts: list[_messages.ModelRequestPart] = [] async for _event in _agent_graph.process_function_tools( graph_ctx.deps.tool_manager, tool_calls, final_result, graph_ctx, parts, ): pass if parts: messages.append(_messages.ModelRequest(parts)) yield StreamedRunResult( messages, graph_ctx.deps.new_message_index, stream, on_complete, ) break elif self.is_call_tools_node(node) and event_stream_handler is not None: async with node.stream(agent_run.ctx) as stream: await event_stream_handler(_agent_graph.build_run_context(agent_run.ctx), stream) next_node = await agent_run.next(node) if isinstance(next_node, End) and agent_run.result is not None: # A final output could have been produced by the CallToolsNode rather than the ModelRequestNode, # if a tool function raised CallDeferred or ApprovalRequired. # In this case there's no response to stream, but we still let the user access the output etc as normal. 
yield StreamedRunResult( graph_ctx.state.message_history, graph_ctx.deps.new_message_index, run_result=agent_run.result, ) yielded = True break if not isinstance(next_node, _agent_graph.AgentNode): raise exceptions.AgentRunError( # pragma: no cover 'Should have produced a StreamedRunResult before getting here' ) node = cast(_agent_graph.AgentNode[Any, Any], next_node) if not yielded: raise exceptions.AgentRunError('Agent run finished without producing a final result') # pragma: no cover @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager @abstractmethod async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. Returns: The result of the run. """ raise NotImplementedError yield @contextmanager @abstractmethod def override( self, *, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. """ raise NotImplementedError yield def _infer_name(self, function_frame: FrameType | None) -> None: """Infer the agent name from the call frame. RunUsage should be `self._infer_name(inspect.currentframe())`. 
""" assert self.name is None, 'Name already set' if function_frame is not None: # pragma: no branch if parent_frame := function_frame.f_back: # pragma: no branch for name, item in parent_frame.f_locals.items(): if item is self: self.name = name return if parent_frame.f_locals != parent_frame.f_globals: # pragma: no branch # if we couldn't find the agent in locals and globals are a different dict, try globals for name, item in parent_frame.f_globals.items(): if item is self: self.name = name return @staticmethod @contextmanager def sequential_tool_calls() -> Iterator[None]: """Run tool calls sequentially during the context.""" with ToolManager.sequential_tool_calls(): yield @staticmethod def is_model_request_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.ModelRequestNode[T, S]]: """Check if the node is a `ModelRequestNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.ModelRequestNode) @staticmethod def is_call_tools_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.CallToolsNode[T, S]]: """Check if the node is a `CallToolsNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.CallToolsNode) @staticmethod def is_user_prompt_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.UserPromptNode[T, S]]: """Check if the node is a `UserPromptNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.UserPromptNode) @staticmethod def is_end_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[End[result.FinalResult[S]]]: """Check if the node is a `End`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, End) @abstractmethod async def __aenter__(self) -> AbstractAgent[AgentDepsT, OutputDataT]: raise NotImplementedError @abstractmethod async def __aexit__(self, *args: Any) -> bool | None: raise NotImplementedError def to_ag_ui( self, *, # Agent.iter parameters output_type: OutputSpec[OutputDataT] | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, # Starlette debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None = None, ) -> AGUIApp[AgentDepsT, OutputDataT]: """Returns an ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ag_ui.StateHandler] protocol. 
To provide different `deps` for each request (e.g. based on the authenticated user), use [`pydantic_ai.ag_ui.run_ag_ui`][pydantic_ai.ag_ui.run_ag_ui] or [`pydantic_ai.ag_ui.handle_ag_ui_request`][pydantic_ai.ag_ui.handle_ag_ui_request] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') app = agent.to_ag_ui() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` See [AG-UI docs](../ag-ui.md) for more information. Args: output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. Returns: An ASGI application for running Pydantic AI agents with AG-UI protocol support. 
""" from ..ag_ui import AGUIApp return AGUIApp( agent=self, # Agent.iter parameters output_type=output_type, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, # Starlette debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) def to_a2a( self, *, storage: Storage | None = None, broker: Broker | None = None, # Agent card name: str | None = None, url: str = 'http://localhost:8000', version: str = '1.0.0', description: str | None = None, provider: AgentProvider | None = None, skills: list[Skill] | None = None, # Starlette debug: bool = False, routes: Sequence[Route] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: dict[Any, ExceptionHandler] | None = None, lifespan: Lifespan[FastA2A] | None = None, ) -> FastA2A: """Convert the agent to a FastA2A application. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') app = agent.to_a2a() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` """ from .._a2a import agent_to_a2a return agent_to_a2a( self, storage=storage, broker=broker, name=name, url=url, version=version, description=description, provider=provider, skills=skills, debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, lifespan=lifespan, ) async def to_cli( self: Self, deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: list[_messages.ModelMessage] | None = None, ) -> None: """Run the agent in a CLI chat interface. Args: deps: The dependencies to pass to the agent. prog_name: The name of the program to use for the CLI. Defaults to 'pydantic-ai'. message_history: History of the conversation so far. Example: ```python {title="agent_to_cli.py" test="skip"} from pydantic_ai import Agent agent = Agent('openai:gpt-4o', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` """ from rich.console import Console from pydantic_ai._cli import run_chat await run_chat( stream=True, agent=self, deps=deps, console=Console(), code_theme='monokai', prog_name=prog_name, message_history=message_history, ) def to_cli_sync( self: Self, deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: list[_messages.ModelMessage] | None = None, ) -> None: """Run the agent in a CLI chat interface with the non-async interface. Args: deps: The dependencies to pass to the agent. prog_name: The name of the program to use for the CLI. Defaults to 'pydantic-ai'. message_history: History of the conversation so far. ```python {title="agent_to_cli_sync.py" test="skip"} from pydantic_ai import Agent agent = Agent('openai:gpt-4o', instructions='You always respond in Italian.') agent.to_cli_sync() agent.to_cli_sync(prog_name='assistant') ``` """ return get_event_loop().run_until_complete( self.to_cli(deps=deps, prog_name=prog_name, message_history=message_history) ) ```` #### model ```python model: Model | KnownModelName | str | None ``` The default model configured for this agent. #### name ```python name: str | None ``` The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. #### deps_type ```python deps_type: type ``` The type of dependencies used by the agent. 
#### output_type ```python output_type: OutputSpec[OutputDataT] ``` The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`. #### event_stream_handler ```python event_stream_handler: EventStreamHandler[AgentDepsT] | None ``` Optional handler for events from the model's streaming response and the agent's execution of tools. #### toolsets ```python toolsets: Sequence[AbstractToolset[AgentDepsT]] ``` All toolsets registered on the agent. Output tools are not included. #### run ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[Any] ``` Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. 
| `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. Returns: The result of the run. 
""" if infer_name and self.name is None: self._infer_name(inspect.currentframe()) event_stream_handler = event_stream_handler or self.event_stream_handler async with self.iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, toolsets=toolsets, ) as agent_run: async for node in agent_run: if event_stream_handler is not None and ( self.is_model_request_node(node) or self.is_call_tools_node(node) ): async with node.stream(agent_run.ctx) as stream: await event_stream_handler(_agent_graph.build_run_context(agent_run.ctx), stream) assert agent_run.result is not None, 'The graph run did not finish properly' return agent_run.result ```` #### run_sync ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[Any] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. 
| `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. 
usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) return get_event_loop().run_until_complete( self.run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=False, toolsets=toolsets, event_stream_handler=event_stream_handler, ) ) ```` #### run_stream ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, OutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, RunOutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]] ``` Run the agent with a user prompt in async streaming mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage. 
As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use agent.run() with an `event_stream_handler` or agent.iter() instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does not receive any events after the final result is found. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[StreamedRunResult[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python @asynccontextmanager async def run_stream( # noqa C901 self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AsyncIterator[result.StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async streaming mode. 
This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage. As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use [`agent.run()`][pydantic_ai.agent.AbstractAgent.run] with an `event_stream_handler` or [`agent.iter()`][pydantic_ai.agent.AbstractAgent.iter] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does _not_ receive any events after the final result is found. Returns: The result of the run. 
""" if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) event_stream_handler = event_stream_handler or self.event_stream_handler yielded = False async with self.iter( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=False, toolsets=toolsets, ) as agent_run: first_node = agent_run.next_node # start with the first node assert isinstance(first_node, _agent_graph.UserPromptNode) # the first node should be a user prompt node node = first_node while True: graph_ctx = agent_run.ctx if self.is_model_request_node(node): async with node.stream(graph_ctx) as stream: final_result_event = None async def stream_to_final( stream: AgentStream, ) -> AsyncIterator[_messages.ModelResponseStreamEvent]: nonlocal final_result_event async for event in stream: yield event if isinstance(event, _messages.FinalResultEvent): final_result_event = event break if event_stream_handler is not None: await event_stream_handler( _agent_graph.build_run_context(graph_ctx), stream_to_final(stream) ) else: async for _ in stream_to_final(stream): pass if final_result_event is not None: final_result = FinalResult( stream, final_result_event.tool_name, final_result_event.tool_call_id ) if yielded: raise exceptions.AgentRunError('Agent run produced final results') # pragma: no cover yielded = True messages = graph_ctx.state.message_history.copy() async def on_complete() -> None: """Called when the stream has completed. The model response will have been added to messages by now by `StreamedRunResult._marked_completed`. """ last_message = messages[-1] assert isinstance(last_message, _messages.ModelResponse) tool_calls = [ part for part in last_message.parts if isinstance(part, _messages.ToolCallPart) ] parts: list[_messages.ModelRequestPart] = [] async for _event in _agent_graph.process_function_tools( graph_ctx.deps.tool_manager, tool_calls, final_result, graph_ctx, parts, ): pass if parts: messages.append(_messages.ModelRequest(parts)) yield StreamedRunResult( messages, graph_ctx.deps.new_message_index, stream, on_complete, ) break elif self.is_call_tools_node(node) and event_stream_handler is not None: async with node.stream(agent_run.ctx) as stream: await event_stream_handler(_agent_graph.build_run_context(agent_run.ctx), stream) next_node = await agent_run.next(node) if isinstance(next_node, End) and agent_run.result is not None: # A final output could have been produced by the CallToolsNode rather than the ModelRequestNode, # if a tool function raised CallDeferred or ApprovalRequired. # In this case there's no response to stream, but we still let the user access the output etc as normal. 
yield StreamedRunResult( graph_ctx.state.message_history, graph_ctx.deps.new_message_index, run_result=agent_run.result, ) yielded = True break if not isinstance(next_node, _agent_graph.AgentNode): raise exceptions.AgentRunError( # pragma: no cover 'Should have produced a StreamedRunResult before getting here' ) node = cast(_agent_graph.AgentNode[Any, Any], next_node) if not yielded: raise exceptions.AgentRunError('Agent run finished without producing a final result') # pragma: no cover ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
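For instance, here is a minimal sketch (placeholder model and prompt) of consuming the stream of events emitted while each model request is handled; the canonical node-by-node example follows:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            # Only model request nodes expose a streaming model response.
            if agent.is_model_request_node(node):
                async with node.stream(agent_run.ctx) as stream:
                    async for event in stream:
                        print(type(event).__name__)  # e.g. PartStartEvent, PartDeltaEvent, ...
    print(agent_run.result.output)
```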
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python @asynccontextmanager @abstractmethod async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. 
The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. Returns: The result of the run. """ raise NotImplementedError yield ```` #### override ```python override( *, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../../testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `deps` | `AgentDepsT | Unset` | The dependencies to use instead of the dependencies passed to the agent run. | `UNSET` | | `model` | `Model | KnownModelName | str | Unset` | The model to use instead of the model passed to the agent run. | `UNSET` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | Unset` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. 
| `UNSET` | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | Unset` | The tools to use instead of the tools registered with the agent. | `UNSET` | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @contextmanager @abstractmethod def override( self, *, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. """ raise NotImplementedError yield ``` #### sequential_tool_calls ```python sequential_tool_calls() -> Iterator[None] ``` Run tool calls sequentially during the context. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod @contextmanager def sequential_tool_calls() -> Iterator[None]: """Run tool calls sequentially during the context.""" with ToolManager.sequential_tool_calls(): yield ``` #### is_model_request_node ```python is_model_request_node( node: AgentNode[T, S] | End[FinalResult[S]], ) -> TypeIs[ModelRequestNode[T, S]] ``` Check if the node is a `ModelRequestNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod def is_model_request_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.ModelRequestNode[T, S]]: """Check if the node is a `ModelRequestNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.ModelRequestNode) ``` #### is_call_tools_node ```python is_call_tools_node( node: AgentNode[T, S] | End[FinalResult[S]], ) -> TypeIs[CallToolsNode[T, S]] ``` Check if the node is a `CallToolsNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod def is_call_tools_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.CallToolsNode[T, S]]: """Check if the node is a `CallToolsNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.CallToolsNode) ``` #### is_user_prompt_node ```python is_user_prompt_node( node: AgentNode[T, S] | End[FinalResult[S]], ) -> TypeIs[UserPromptNode[T, S]] ``` Check if the node is a `UserPromptNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. 
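A minimal sketch (placeholder model and prompt) of using these type guards to drive an agent run manually with `agent_run.next()` instead of `async for`:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        node = agent_run.next_node  # the first node is a UserPromptNode
        while not agent.is_end_node(node):
            # Execute the current node and get the next one.
            node = await agent_run.next(node)
        print(agent_run.result.output)
```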
Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod def is_user_prompt_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.UserPromptNode[T, S]]: """Check if the node is a `UserPromptNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.UserPromptNode) ``` #### is_end_node ```python is_end_node( node: AgentNode[T, S] | End[FinalResult[S]], ) -> TypeIs[End[FinalResult[S]]] ``` Check if the node is a `End`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod def is_end_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[End[result.FinalResult[S]]]: """Check if the node is a `End`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, End) ``` #### to_ag_ui ```python to_ag_ui( *, output_type: OutputSpec[OutputDataT] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: ( Mapping[Any, ExceptionHandler] | None ) = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: ( Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None ) = None ) -> AGUIApp[AgentDepsT, OutputDataT] ``` Returns an ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the StateHandler protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use pydantic_ai.ag_ui.run_ag_ui or pydantic_ai.ag_ui.handle_ag_ui_request instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') app = agent.to_ag_ui() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` See [AG-UI docs](../../ag-ui/) for more information. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_type` | `OutputSpec[OutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. 
| `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `debug` | `bool` | Boolean indicating if debug tracebacks should be returned on errors. | `False` | | `routes` | `Sequence[BaseRoute] | None` | A list of routes to serve incoming HTTP and WebSocket requests. | `None` | | `middleware` | `Sequence[Middleware] | None` | A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. ServerErrorMiddleware is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. ExceptionMiddleware is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. | `None` | | `exception_handlers` | `Mapping[Any, ExceptionHandler] | None` | A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form handler(request, exc) -> response and may be either standard functions, or async functions. | `None` | | `on_startup` | `Sequence[Callable[[], Any]] | None` | A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. | `None` | | `on_shutdown` | `Sequence[Callable[[], Any]] | None` | A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. | `None` | | `lifespan` | `Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None` | A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the on_startup and on_shutdown handlers. Use one or the other, not both. | `None` | Returns: | Type | Description | | --- | --- | | `AGUIApp[AgentDepsT, OutputDataT]` | An ASGI application for running Pydantic AI agents with AG-UI protocol support. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def to_ag_ui( self, *, # Agent.iter parameters output_type: OutputSpec[OutputDataT] | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, # Starlette debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None = None, ) -> AGUIApp[AgentDepsT, OutputDataT]: """Returns an ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ag_ui.StateHandler] protocol. To provide different `deps` for each request (e.g. 
based on the authenticated user), use [`pydantic_ai.ag_ui.run_ag_ui`][pydantic_ai.ag_ui.run_ag_ui] or [`pydantic_ai.ag_ui.handle_ag_ui_request`][pydantic_ai.ag_ui.handle_ag_ui_request] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') app = agent.to_ag_ui() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` See [AG-UI docs](../ag-ui.md) for more information. Args: output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. Returns: An ASGI application for running Pydantic AI agents with AG-UI protocol support. 
""" from ..ag_ui import AGUIApp return AGUIApp( agent=self, # Agent.iter parameters output_type=output_type, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, # Starlette debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) ```` #### to_a2a ```python to_a2a( *, storage: Storage | None = None, broker: Broker | None = None, name: str | None = None, url: str = "http://localhost:8000", version: str = "1.0.0", description: str | None = None, provider: AgentProvider | None = None, skills: list[Skill] | None = None, debug: bool = False, routes: Sequence[Route] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: ( dict[Any, ExceptionHandler] | None ) = None, lifespan: Lifespan[FastA2A] | None = None ) -> FastA2A ``` Convert the agent to a FastA2A application. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') app = agent.to_a2a() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def to_a2a( self, *, storage: Storage | None = None, broker: Broker | None = None, # Agent card name: str | None = None, url: str = 'http://localhost:8000', version: str = '1.0.0', description: str | None = None, provider: AgentProvider | None = None, skills: list[Skill] | None = None, # Starlette debug: bool = False, routes: Sequence[Route] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: dict[Any, ExceptionHandler] | None = None, lifespan: Lifespan[FastA2A] | None = None, ) -> FastA2A: """Convert the agent to a FastA2A application. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') app = agent.to_a2a() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` """ from .._a2a import agent_to_a2a return agent_to_a2a( self, storage=storage, broker=broker, name=name, url=url, version=version, description=description, provider=provider, skills=skills, debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, lifespan=lifespan, ) ```` #### to_cli ```python to_cli( deps: AgentDepsT = None, prog_name: str = "pydantic-ai", message_history: list[ModelMessage] | None = None, ) -> None ``` Run the agent in a CLI chat interface. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `deps` | `AgentDepsT` | The dependencies to pass to the agent. | `None` | | `prog_name` | `str` | The name of the program to use for the CLI. Defaults to 'pydantic-ai'. | `'pydantic-ai'` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | Example: agent_to_cli.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python async def to_cli( self: Self, deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: list[_messages.ModelMessage] | None = None, ) -> None: """Run the agent in a CLI chat interface. 
Args: deps: The dependencies to pass to the agent. prog_name: The name of the program to use for the CLI. Defaults to 'pydantic-ai'. message_history: History of the conversation so far. Example: ```python {title="agent_to_cli.py" test="skip"} from pydantic_ai import Agent agent = Agent('openai:gpt-4o', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` """ from rich.console import Console from pydantic_ai._cli import run_chat await run_chat( stream=True, agent=self, deps=deps, console=Console(), code_theme='monokai', prog_name=prog_name, message_history=message_history, ) ```` #### to_cli_sync ```python to_cli_sync( deps: AgentDepsT = None, prog_name: str = "pydantic-ai", message_history: list[ModelMessage] | None = None, ) -> None ``` Run the agent in a CLI chat interface with the non-async interface. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `deps` | `AgentDepsT` | The dependencies to pass to the agent. | `None` | | `prog_name` | `str` | The name of the program to use for the CLI. Defaults to 'pydantic-ai'. | `'pydantic-ai'` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | agent_to_cli_sync.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o', instructions='You always respond in Italian.') agent.to_cli_sync() agent.to_cli_sync(prog_name='assistant') ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def to_cli_sync( self: Self, deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: list[_messages.ModelMessage] | None = None, ) -> None: """Run the agent in a CLI chat interface with the non-async interface. Args: deps: The dependencies to pass to the agent. prog_name: The name of the program to use for the CLI. Defaults to 'pydantic-ai'. message_history: History of the conversation so far. ```python {title="agent_to_cli_sync.py" test="skip"} from pydantic_ai import Agent agent = Agent('openai:gpt-4o', instructions='You always respond in Italian.') agent.to_cli_sync() agent.to_cli_sync(prog_name='assistant') ``` """ return get_event_loop().run_until_complete( self.to_cli(deps=deps, prog_name=prog_name, message_history=message_history) ) ```` ### WrapperAgent Bases: `AbstractAgent[AgentDepsT, OutputDataT]` Agent which wraps another agent. Does nothing on its own, used as a base class. Source code in `pydantic_ai_slim/pydantic_ai/agent/wrapper.py` ````python class WrapperAgent(AbstractAgent[AgentDepsT, OutputDataT]): """Agent which wraps another agent. Does nothing on its own, used as a base class. 
""" def __init__(self, wrapped: AbstractAgent[AgentDepsT, OutputDataT]): self.wrapped = wrapped @property def model(self) -> models.Model | models.KnownModelName | str | None: return self.wrapped.model @property def name(self) -> str | None: return self.wrapped.name @name.setter def name(self, value: str | None) -> None: self.wrapped.name = value @property def deps_type(self) -> type: return self.wrapped.deps_type @property def output_type(self) -> OutputSpec[OutputDataT]: return self.wrapped.output_type @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: return self.wrapped.event_stream_handler @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: return self.wrapped.toolsets async def __aenter__(self) -> AbstractAgent[AgentDepsT, OutputDataT]: return await self.wrapped.__aenter__() async def __aexit__(self, *args: Any) -> bool | None: return await self.wrapped.__aexit__(*args) @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. Returns: The result of the run. """ async with self.wrapped.iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, ) as run: yield run @contextmanager def override( self, *, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. 
""" with self.wrapped.override(deps=deps, model=model, toolsets=toolsets, tools=tools): yield ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. 
``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/wrapper.py` ````python @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. Returns: The result of the run. """ async with self.wrapped.iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, ) as run: yield run ```` #### override ```python override( *, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../../testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `deps` | `AgentDepsT | Unset` | The dependencies to use instead of the dependencies passed to the agent run. | `UNSET` | | `model` | `Model | KnownModelName | str | Unset` | The model to use instead of the model passed to the agent run. | `UNSET` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | Unset` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | `UNSET` | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | Unset` | The tools to use instead of the tools registered with the agent. 
| `UNSET` | Source code in `pydantic_ai_slim/pydantic_ai/agent/wrapper.py` ```python @contextmanager def override( self, *, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. """ with self.wrapped.override(deps=deps, model=model, toolsets=toolsets, tools=tools): yield ``` ### AgentRun Bases: `Generic[AgentDepsT, OutputDataT]` A stateful, async-iterable run of an Agent. You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`. Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an End is reached, the run finishes and result becomes available. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] # Iterate through the run, recording each node along the way: async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` You can also manually drive the iteration using the next method for more granular control. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ````python @dataclasses.dataclass(repr=False) class AgentRun(Generic[AgentDepsT, OutputDataT]): """A stateful, async-iterable run of an [`Agent`][pydantic_ai.agent.Agent]. You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`. Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an [`End`][pydantic_graph.nodes.End] is reached, the run finishes and [`result`][pydantic_ai.agent.AgentRun.result] becomes available. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] # Iterate through the run, recording each node along the way: async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` You can also manually drive the iteration using the [`next`][pydantic_ai.agent.AgentRun.next] method for more granular control. """ _graph_run: GraphRun[ _agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any], FinalResult[OutputDataT] ] @overload def _traceparent(self, *, required: Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: traceparent = self._graph_run._traceparent(required=False) # type: ignore[reportPrivateUsage] if traceparent is None and required: # pragma: no cover raise AttributeError('No span was created for this agent run') return traceparent @property def ctx(self) -> GraphRunContext[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any]]: """The current context of the agent run.""" return GraphRunContext[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any]]( state=self._graph_run.state, deps=self._graph_run.deps ) @property def next_node( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """The next node that will be run in the agent graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. """ next_node = self._graph_run.next_node if isinstance(next_node, End): return next_node if _agent_graph.is_agent_node(next_node): return next_node raise exceptions.AgentRunError(f'Unexpected node type: {type(next_node)}') # pragma: no cover @property def result(self) -> AgentRunResult[OutputDataT] | None: """The final result of the run if it has ended, otherwise `None`. Once the run returns an [`End`][pydantic_graph.nodes.End] node, `result` is populated with an [`AgentRunResult`][pydantic_ai.agent.AgentRunResult]. 
""" graph_run_result = self._graph_run.result if graph_run_result is None: return None return AgentRunResult( graph_run_result.output.output, graph_run_result.output.tool_name, graph_run_result.state, self._graph_run.deps.new_message_index, self._traceparent(required=False), ) def __aiter__( self, ) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]]: """Provide async-iteration over the nodes in the agent run.""" return self async def __anext__( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Advance to the next node automatically based on the last returned node.""" next_node = await self._graph_run.__anext__() if _agent_graph.is_agent_node(node=next_node): return next_node assert isinstance(next_node, End), f'Unexpected node type: {type(next_node)}' return next_node async def next( self, node: _agent_graph.AgentNode[AgentDepsT, OutputDataT], ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`][pydantic_graph.nodes.End] node. Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-4o') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Args: node: The node to run next in the graph. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ # Note: It might be nice to expose a synchronous interface for iteration, but we shouldn't do it # on this class, or else IDEs won't warn you if you accidentally use `for` instead of `async for` to iterate. 
next_node = await self._graph_run.next(node) if _agent_graph.is_agent_node(next_node): return next_node assert isinstance(next_node, End), f'Unexpected node type: {type(next_node)}' return next_node def usage(self) -> _usage.RunUsage: """Get usage statistics for the run so far, including token usage, model requests, and so on.""" return self._graph_run.state.usage def __repr__(self) -> str: # pragma: no cover result = self._graph_run.result result_repr = '' if result is None else repr(result.output) return f'<{type(self).__name__} result={result_repr} usage={self.usage()}>' ```` #### ctx ```python ctx: GraphRunContext[ GraphAgentState, GraphAgentDeps[AgentDepsT, Any] ] ``` The current context of the agent run. #### next_node ```python next_node: ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` The next node that will be run in the agent graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. #### result ```python result: AgentRunResult[OutputDataT] | None ``` The final result of the run if it has ended, otherwise `None`. Once the run returns an End node, `result` is populated with an AgentRunResult. #### __aiter__ ```python __aiter__() -> ( AsyncIterator[ AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ] ) ``` Provide async-iteration over the nodes in the agent run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def __aiter__( self, ) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]]: """Provide async-iteration over the nodes in the agent run.""" return self ``` #### __anext__ ```python __anext__() -> ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` Advance to the next node automatically based on the last returned node. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python async def __anext__( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Advance to the next node automatically based on the last returned node.""" next_node = await self._graph_run.__anext__() if _agent_graph.is_agent_node(node=next_node): return next_node assert isinstance(next_node, End), f'Unexpected node type: {type(next_node)}' return next_node ``` #### next ```python next( node: AgentNode[AgentDepsT, OutputDataT], ) -> ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an End node. 
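As a lighter-weight complement to the full example below, the following sketch (assuming the `Agent.is_model_request_node()` type-narrowing helper and an illustrative prompt) shows how each node can be inspected before being handed back to `next`:

```python
from pydantic_ai import Agent
from pydantic_graph import End

agent = Agent('openai:gpt-4o')


async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        node = agent_run.next_node  # the first node of the run
        while not isinstance(node, End):
            if Agent.is_model_request_node(node):
                # Inspect (or replace) the request node before it is executed.
                print('About to make a model request')
            node = await agent_run.next(node)
        print(agent_run.result.output)
```

The same pattern can branch on other node kinds via helpers such as `Agent.is_call_tools_node()` or `Agent.is_end_node()`, where available.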
Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-4o') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `node` | `AgentNode[AgentDepsT, OutputDataT]` | The node to run next in the graph. | *required* | Returns: | Type | Description | | --- | --- | | `AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]` | The next node returned by the graph logic, or an End node if | | `AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]` | the run has completed. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ````python async def next( self, node: _agent_graph.AgentNode[AgentDepsT, OutputDataT], ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`][pydantic_graph.nodes.End] node. Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-4o') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Args: node: The node to run next in the graph. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. 
""" # Note: It might be nice to expose a synchronous interface for iteration, but we shouldn't do it # on this class, or else IDEs won't warn you if you accidentally use `for` instead of `async for` to iterate. next_node = await self._graph_run.next(node) if _agent_graph.is_agent_node(next_node): return next_node assert isinstance(next_node, End), f'Unexpected node type: {type(next_node)}' return next_node ```` #### usage ```python usage() -> RunUsage ``` Get usage statistics for the run so far, including token usage, model requests, and so on. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def usage(self) -> _usage.RunUsage: """Get usage statistics for the run so far, including token usage, model requests, and so on.""" return self._graph_run.state.usage ``` ### AgentRunResult Bases: `Generic[OutputDataT]` The final result of an agent run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python @dataclasses.dataclass class AgentRunResult(Generic[OutputDataT]): """The final result of an agent run.""" output: OutputDataT """The output data from the agent run.""" _output_tool_name: str | None = dataclasses.field(repr=False) _state: _agent_graph.GraphAgentState = dataclasses.field(repr=False) _new_message_index: int = dataclasses.field(repr=False) _traceparent_value: str | None = dataclasses.field(repr=False) @overload def _traceparent(self, *, required: Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: if self._traceparent_value is None and required: # pragma: no cover raise AttributeError('No span was created for this agent run') return self._traceparent_value def _set_output_tool_return(self, return_content: str) -> list[_messages.ModelMessage]: """Set return content for the output tool. Useful if you want to continue the conversation and want to set the response to the output tool call. """ if not self._output_tool_name: raise ValueError('Cannot set output tool return content when the return type is `str`.') messages = self._state.message_history last_message = messages[-1] for idx, part in enumerate(last_message.parts): if isinstance(part, _messages.ToolReturnPart) and part.tool_name == self._output_tool_name: # Only do deepcopy when we have to modify copied_messages = list(messages) copied_last = deepcopy(last_message) copied_last.parts[idx].content = return_content # type: ignore[misc] copied_messages[-1] = copied_last return copied_messages raise LookupError(f'No tool call found with tool name {self._output_tool_name!r}.') def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. """ if output_tool_return_content is not None: return self._set_output_tool_return(output_tool_return_content) else: return self._state.message_history def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. 
This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) def usage(self) -> _usage.RunUsage: """Return the usage of the whole run.""" return self._state.usage def timestamp(self) -> datetime: """Return the timestamp of last response.""" model_response = self.all_messages()[-1] assert isinstance(model_response, _messages.ModelResponse) return model_response.timestamp ``` #### output ```python output: OutputDataT ``` The output data from the agent run. #### all_messages ```python all_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return the history of \_messages. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `list[ModelMessage]` | List of messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. 
""" if output_tool_return_content is not None: return self._set_output_tool_return(output_tool_return_content) else: return self._state.message_history ``` #### all_messages_json ```python all_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return all messages from all_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `bytes` | JSON bytes representing the messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) ``` #### new_messages ```python new_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return new messages associated with this run. Messages from older runs are excluded. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `list[ModelMessage]` | List of new messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] ``` #### new_messages_json ```python new_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return new messages from new_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. 
This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `bytes` | JSON bytes representing the new messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) ``` #### usage ```python usage() -> RunUsage ``` Return the usage of the whole run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def usage(self) -> _usage.RunUsage: """Return the usage of the whole run.""" return self._state.usage ``` #### timestamp ```python timestamp() -> datetime ``` Return the timestamp of last response. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def timestamp(self) -> datetime: """Return the timestamp of last response.""" model_response = self.all_messages()[-1] assert isinstance(model_response, _messages.ModelResponse) return model_response.timestamp ``` ### EndStrategy ```python EndStrategy = Literal['early', 'exhaustive'] ``` The strategy for handling multiple tool calls when a final result is found. - `'early'`: Stop processing other tool calls once a final result is found - `'exhaustive'`: Process all tool calls even after finding a final result ### RunOutputDataT ```python RunOutputDataT = TypeVar('RunOutputDataT') ``` Type variable for the result data of a run where `output_type` was customized on the run call. ### capture_run_messages ```python capture_run_messages() -> Iterator[list[ModelMessage]] ``` Context manager to access the messages used in a run, run_sync, or run_stream call. Useful when a run may raise an exception, see [model errors](../../agents/#model-errors) for more information. Examples: ```python from pydantic_ai import Agent, capture_run_messages agent = Agent('test') with capture_run_messages() as messages: try: result = agent.run_sync('foobar') except Exception: print(messages) raise ``` Note If you call `run`, `run_sync`, or `run_stream` more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only. Source code in `pydantic_ai_slim/pydantic_ai/_agent_graph.py` ````python @contextmanager def capture_run_messages() -> Iterator[list[_messages.ModelMessage]]: """Context manager to access the messages used in a [`run`][pydantic_ai.agent.AbstractAgent.run], [`run_sync`][pydantic_ai.agent.AbstractAgent.run_sync], or [`run_stream`][pydantic_ai.agent.AbstractAgent.run_stream] call. Useful when a run may raise an exception, see [model errors](../agents.md#model-errors) for more information. 
Examples: ```python from pydantic_ai import Agent, capture_run_messages agent = Agent('test') with capture_run_messages() as messages: try: result = agent.run_sync('foobar') except Exception: print(messages) raise ``` !!! note If you call `run`, `run_sync`, or `run_stream` more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only. """ token = None messages: list[_messages.ModelMessage] = [] # Try to reuse existing message context if available try: messages = _messages_ctx_var.get().messages except LookupError: # No existing context, create a new one token = _messages_ctx_var.set(_RunMessages(messages)) try: yield messages finally: # Clean up context if we created it if token is not None: _messages_ctx_var.reset(token) ```` ### InstrumentationSettings Options for instrumenting models and agents with OpenTelemetry. Used in: - `Agent(instrument=...)` - Agent.instrument_all() - InstrumentedModel See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python @dataclass(init=False) class InstrumentationSettings: """Options for instrumenting models and agents with OpenTelemetry. Used in: - `Agent(instrument=...)` - [`Agent.instrument_all()`][pydantic_ai.agent.Agent.instrument_all] - [`InstrumentedModel`][pydantic_ai.models.instrumented.InstrumentedModel] See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. """ tracer: Tracer = field(repr=False) event_logger: EventLogger = field(repr=False) event_mode: Literal['attributes', 'logs'] = 'attributes' include_binary_content: bool = True include_content: bool = True version: Literal[1, 2] = 1 def __init__( self, *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2] = 2, event_mode: Literal['attributes', 'logs'] = 'attributes', event_logger_provider: EventLoggerProvider | None = None, ): """Create instrumentation options. Args: tracer_provider: The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. meter_provider: The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. include_binary_content: Whether to include binary content in the instrumentation events. include_content: Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. version: Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `event_logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. - `pydantic_ai.all_messages` on agent run spans. event_mode: The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. 
event_logger_provider: The OpenTelemetry event logger provider to use. If not provided, the global event logger provider is used. Calling `logfire.configure()` sets the global event logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. """ from pydantic_ai import __version__ tracer_provider = tracer_provider or get_tracer_provider() meter_provider = meter_provider or get_meter_provider() event_logger_provider = event_logger_provider or get_event_logger_provider() scope_name = 'pydantic-ai' self.tracer = tracer_provider.get_tracer(scope_name, __version__) self.meter = meter_provider.get_meter(scope_name, __version__) self.event_logger = event_logger_provider.get_event_logger(scope_name, __version__) self.event_mode = event_mode self.include_binary_content = include_binary_content self.include_content = include_content if event_mode == 'logs' and version != 1: warnings.warn( 'event_mode is only relevant for version=1 which is deprecated and will be removed in a future release.', stacklevel=2, ) version = 1 self.version = version # As specified in the OpenTelemetry GenAI metrics spec: # https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage tokens_histogram_kwargs = dict( name='gen_ai.client.token.usage', unit='{token}', description='Measures number of input and output tokens used', ) try: self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, explicit_bucket_boundaries_advisory=TOKEN_HISTOGRAM_BOUNDARIES, ) except TypeError: # pragma: lax no cover # Older OTel/logfire versions don't support explicit_bucket_boundaries_advisory self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, # pyright: ignore ) def messages_to_otel_events(self, messages: list[ModelMessage]) -> list[Event]: """Convert a list of model messages to OpenTelemetry events. Args: messages: The messages to convert. Returns: A list of OpenTelemetry events. 
""" events: list[Event] = [] instructions = InstrumentedModel._get_instructions(messages) # pyright: ignore [reportPrivateUsage] if instructions is not None: events.append( Event( 'gen_ai.system.message', body={**({'content': instructions} if self.include_content else {}), 'role': 'system'}, ) ) for message_index, message in enumerate(messages): message_events: list[Event] = [] if isinstance(message, ModelRequest): for part in message.parts: if hasattr(part, 'otel_event'): message_events.append(part.otel_event(self)) elif isinstance(message, ModelResponse): # pragma: no branch message_events = message.otel_events(self) for event in message_events: event.attributes = { 'gen_ai.message.index': message_index, **(event.attributes or {}), } events.extend(message_events) for event in events: event.body = InstrumentedModel.serialize_any(event.body) return events def messages_to_otel_messages(self, messages: list[ModelMessage]) -> list[_otel_messages.ChatMessage]: result: list[_otel_messages.ChatMessage] = [] for message in messages: if isinstance(message, ModelRequest): for is_system, group in itertools.groupby(message.parts, key=lambda p: isinstance(p, SystemPromptPart)): message_parts: list[_otel_messages.MessagePart] = [] for part in group: if hasattr(part, 'otel_message_parts'): message_parts.extend(part.otel_message_parts(self)) result.append( _otel_messages.ChatMessage(role='system' if is_system else 'user', parts=message_parts) ) elif isinstance(message, ModelResponse): # pragma: no branch otel_message = _otel_messages.OutputMessage(role='assistant', parts=message.otel_message_parts(self)) if message.finish_reason is not None: otel_message['finish_reason'] = message.finish_reason result.append(otel_message) return result def handle_messages(self, input_messages: list[ModelMessage], response: ModelResponse, system: str, span: Span): if self.version == 1: events = self.messages_to_otel_events(input_messages) for event in self.messages_to_otel_events([response]): events.append( Event( 'gen_ai.choice', body={ 'index': 0, 'message': event.body, }, ) ) for event in events: event.attributes = { GEN_AI_SYSTEM_ATTRIBUTE: system, **(event.attributes or {}), } self._emit_events(span, events) else: output_messages = self.messages_to_otel_messages([response]) assert len(output_messages) == 1 output_message = output_messages[0] instructions = InstrumentedModel._get_instructions(input_messages) # pyright: ignore [reportPrivateUsage] system_instructions_attributes = self.system_instructions_attributes(instructions) attributes: dict[str, AttributeValue] = { 'gen_ai.input.messages': json.dumps(self.messages_to_otel_messages(input_messages)), 'gen_ai.output.messages': json.dumps([output_message]), **system_instructions_attributes, 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { 'gen_ai.input.messages': {'type': 'array'}, 'gen_ai.output.messages': {'type': 'array'}, **( {'gen_ai.system_instructions': {'type': 'array'}} if system_instructions_attributes else {} ), 'model_request_parameters': {'type': 'object'}, }, } ), } span.set_attributes(attributes) def system_instructions_attributes(self, instructions: str | None) -> dict[str, str]: if instructions and self.include_content: return { 'gen_ai.system_instructions': json.dumps([_otel_messages.TextPart(type='text', content=instructions)]), } return {} def _emit_events(self, span: Span, events: list[Event]) -> None: if self.event_mode == 'logs': for event in events: self.event_logger.emit(event) else: attr_name = 'events' 
span.set_attributes( { attr_name: json.dumps([InstrumentedModel.event_to_dict(event) for event in events]), 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { attr_name: {'type': 'array'}, 'model_request_parameters': {'type': 'object'}, }, } ), } ) ``` #### __init__ ```python __init__( *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2] = 2, event_mode: Literal[ "attributes", "logs" ] = "attributes", event_logger_provider: EventLoggerProvider | None = None ) ``` Create instrumentation options. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `tracer_provider` | `TracerProvider | None` | The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling logfire.configure() sets the global tracer provider, so most users don't need this. | `None` | | `meter_provider` | `MeterProvider | None` | The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling logfire.configure() sets the global meter provider, so most users don't need this. | `None` | | `include_binary_content` | `bool` | Whether to include binary content in the instrumentation events. | `True` | | `include_content` | `bool` | Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. | `True` | | `version` | `Literal[1, 2]` | Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters event_mode and event_logger_provider are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - gen_ai.system_instructions for instructions passed to the agent. - gen_ai.input.messages and gen_ai.output.messages on model request spans. - pydantic_ai.all_messages on agent run spans. | `2` | | `event_mode` | `Literal['attributes', 'logs']` | The mode for emitting events in version 1. If 'attributes', events are attached to the span as attributes. If 'logs', events are emitted as OpenTelemetry log-based events. | `'attributes'` | | `event_logger_provider` | `EventLoggerProvider | None` | The OpenTelemetry event logger provider to use. If not provided, the global event logger provider is used. Calling logfire.configure() sets the global event logger provider, so most users don't need this. This is only used if event_mode='logs' and version=1. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def __init__( self, *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2] = 2, event_mode: Literal['attributes', 'logs'] = 'attributes', event_logger_provider: EventLoggerProvider | None = None, ): """Create instrumentation options. Args: tracer_provider: The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. meter_provider: The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. 
include_binary_content: Whether to include binary content in the instrumentation events. include_content: Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. version: Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `event_logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. - `pydantic_ai.all_messages` on agent run spans. event_mode: The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. event_logger_provider: The OpenTelemetry event logger provider to use. If not provided, the global event logger provider is used. Calling `logfire.configure()` sets the global event logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. """ from pydantic_ai import __version__ tracer_provider = tracer_provider or get_tracer_provider() meter_provider = meter_provider or get_meter_provider() event_logger_provider = event_logger_provider or get_event_logger_provider() scope_name = 'pydantic-ai' self.tracer = tracer_provider.get_tracer(scope_name, __version__) self.meter = meter_provider.get_meter(scope_name, __version__) self.event_logger = event_logger_provider.get_event_logger(scope_name, __version__) self.event_mode = event_mode self.include_binary_content = include_binary_content self.include_content = include_content if event_mode == 'logs' and version != 1: warnings.warn( 'event_mode is only relevant for version=1 which is deprecated and will be removed in a future release.', stacklevel=2, ) version = 1 self.version = version # As specified in the OpenTelemetry GenAI metrics spec: # https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage tokens_histogram_kwargs = dict( name='gen_ai.client.token.usage', unit='{token}', description='Measures number of input and output tokens used', ) try: self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, explicit_bucket_boundaries_advisory=TOKEN_HISTOGRAM_BOUNDARIES, ) except TypeError: # pragma: lax no cover # Older OTel/logfire versions don't support explicit_bucket_boundaries_advisory self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, # pyright: ignore ) ``` #### messages_to_otel_events ```python messages_to_otel_events( messages: list[ModelMessage], ) -> list[Event] ``` Convert a list of model messages to OpenTelemetry events. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `messages` | `list[ModelMessage]` | The messages to convert. | *required* | Returns: | Type | Description | | --- | --- | | `list[Event]` | A list of OpenTelemetry events. | Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def messages_to_otel_events(self, messages: list[ModelMessage]) -> list[Event]: """Convert a list of model messages to OpenTelemetry events. Args: messages: The messages to convert. Returns: A list of OpenTelemetry events. 
""" events: list[Event] = [] instructions = InstrumentedModel._get_instructions(messages) # pyright: ignore [reportPrivateUsage] if instructions is not None: events.append( Event( 'gen_ai.system.message', body={**({'content': instructions} if self.include_content else {}), 'role': 'system'}, ) ) for message_index, message in enumerate(messages): message_events: list[Event] = [] if isinstance(message, ModelRequest): for part in message.parts: if hasattr(part, 'otel_event'): message_events.append(part.otel_event(self)) elif isinstance(message, ModelResponse): # pragma: no branch message_events = message.otel_events(self) for event in message_events: event.attributes = { 'gen_ai.message.index': message_index, **(event.attributes or {}), } events.extend(message_events) for event in events: event.body = InstrumentedModel.serialize_any(event.body) return events ``` ### EventStreamHandler ```python EventStreamHandler: TypeAlias = Callable[ [ RunContext[AgentDepsT], AsyncIterable[AgentStreamEvent], ], Awaitable[None], ] ``` A function that receives agent RunContext and an async iterable of events from the model's streaming response and the agent's execution of tools. # `pydantic_ai.builtin_tools` ### AbstractBuiltinTool Bases: `ABC` A builtin tool that can be used by an agent. This class is abstract and cannot be instantiated directly. The builtin tools are passed to the model as part of the `ModelRequestParameters`. Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @dataclass(kw_only=True) class AbstractBuiltinTool(ABC): """A builtin tool that can be used by an agent. This class is abstract and cannot be instantiated directly. The builtin tools are passed to the model as part of the `ModelRequestParameters`. """ ``` ### WebSearchTool Bases: `AbstractBuiltinTool` A builtin tool that allows your agent to search the web for information. The parameters that PydanticAI passes depend on the model, as some parameters may not be supported by certain models. Supported by: - Anthropic - OpenAI Responses - Groq - Google Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @dataclass(kw_only=True) class WebSearchTool(AbstractBuiltinTool): """A builtin tool that allows your agent to search the web for information. The parameters that PydanticAI passes depend on the model, as some parameters may not be supported by certain models. Supported by: * Anthropic * OpenAI Responses * Groq * Google """ search_context_size: Literal['low', 'medium', 'high'] = 'medium' """The `search_context_size` parameter controls how much context is retrieved from the web to help the tool formulate a response. Supported by: * OpenAI Responses """ user_location: WebSearchUserLocation | None = None """The `user_location` parameter allows you to localize search results based on a user's location. Supported by: * Anthropic * OpenAI Responses """ blocked_domains: list[str] | None = None """If provided, these domains will never appear in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: * Anthropic, see * Groq, see """ allowed_domains: list[str] | None = None """If provided, only these domains will be included in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: * Anthropic, see * Groq, see """ max_uses: int | None = None """If provided, the tool will stop searching the web after the given number of uses. 
Supported by: * Anthropic """ ``` #### search_context_size ```python search_context_size: Literal["low", "medium", "high"] = ( "medium" ) ``` The `search_context_size` parameter controls how much context is retrieved from the web to help the tool formulate a response. Supported by: - OpenAI Responses #### user_location ```python user_location: WebSearchUserLocation | None = None ``` The `user_location` parameter allows you to localize search results based on a user's location. Supported by: - Anthropic - OpenAI Responses #### blocked_domains ```python blocked_domains: list[str] | None = None ``` If provided, these domains will never appear in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: - Anthropic, see - Groq, see #### allowed_domains ```python allowed_domains: list[str] | None = None ``` If provided, only these domains will be included in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: - Anthropic, see - Groq, see #### max_uses ```python max_uses: int | None = None ``` If provided, the tool will stop searching the web after the given number of uses. Supported by: - Anthropic ### WebSearchUserLocation Bases: `TypedDict` Allows you to localize search results based on a user's location. Supported by: - Anthropic - OpenAI Responses Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python class WebSearchUserLocation(TypedDict, total=False): """Allows you to localize search results based on a user's location. Supported by: * Anthropic * OpenAI Responses """ city: str """The city where the user is located.""" country: str """The country where the user is located. For OpenAI, this must be a 2-letter country code (e.g., 'US', 'GB').""" region: str """The region or state where the user is located.""" timezone: str """The timezone of the user's location.""" ``` #### city ```python city: str ``` The city where the user is located. #### country ```python country: str ``` The country where the user is located. For OpenAI, this must be a 2-letter country code (e.g., 'US', 'GB'). #### region ```python region: str ``` The region or state where the user is located. #### timezone ```python timezone: str ``` The timezone of the user's location. ### CodeExecutionTool Bases: `AbstractBuiltinTool` A builtin tool that allows your agent to execute code. Supported by: - Anthropic - OpenAI Responses - Google Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python class CodeExecutionTool(AbstractBuiltinTool): """A builtin tool that allows your agent to execute code. Supported by: * Anthropic * OpenAI Responses * Google """ ``` ### UrlContextTool Bases: `AbstractBuiltinTool` Allows your agent to access contents from URLs. Supported by: - Google Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python class UrlContextTool(AbstractBuiltinTool): """Allows your agent to access contents from URLs. Supported by: * Google """ ``` # `pydantic_ai.common_tools` ### duckduckgo_search_tool ```python duckduckgo_search_tool( duckduckgo_client: DDGS | None = None, max_results: int | None = None, ) ``` Creates a DuckDuckGo search tool. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `duckduckgo_client` | `DDGS | None` | The DuckDuckGo search client. | `None` | | `max_results` | `int | None` | The maximum number of results. If None, returns results only from the first response. 
| `None` | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/duckduckgo.py` ```python def duckduckgo_search_tool(duckduckgo_client: DDGS | None = None, max_results: int | None = None): """Creates a DuckDuckGo search tool. Args: duckduckgo_client: The DuckDuckGo search client. max_results: The maximum number of results. If None, returns results only from the first response. """ return Tool( DuckDuckGoSearchTool(client=duckduckgo_client or DDGS(), max_results=max_results).__call__, name='duckduckgo_search', description='Searches DuckDuckGo for the given query and returns the results.', ) ``` ### tavily_search_tool ```python tavily_search_tool(api_key: str) ``` Creates a Tavily search tool. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str` | The Tavily API key. You can get one by signing up at https://app.tavily.com/home. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/tavily.py` ```python def tavily_search_tool(api_key: str): """Creates a Tavily search tool. Args: api_key: The Tavily API key. You can get one by signing up at [https://app.tavily.com/home](https://app.tavily.com/home). """ return Tool( TavilySearchTool(client=AsyncTavilyClient(api_key)).__call__, name='tavily_search', description='Searches Tavily for the given query and returns the results.', ) ``` # `pydantic_ai.direct` Methods for making imperative requests to language models with minimal abstraction. These methods allow you to make requests to LLMs where the only abstraction is input and output schema translation so you can use all models with the same API. These methods are thin wrappers around Model implementations. ### model_request ```python model_request( model: Model | KnownModelName | str, messages: list[ModelMessage], *, model_settings: ModelSettings | None = None, model_request_parameters: ( ModelRequestParameters | None ) = None, instrument: InstrumentationSettings | bool | None = None ) -> ModelResponse ``` Make a non-streamed request to a model. model_request_example.py ```py from pydantic_ai.direct import model_request from pydantic_ai.messages import ModelRequest async def main(): model_response = await model_request( 'anthropic:claude-3-5-haiku-latest', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1)! ) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-3-5-haiku-latest', timestamp=datetime.datetime(...), ) ''' ``` 1. See ModelRequest.user_text_prompt for details. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model` | `Model | KnownModelName | str` | The model to make a request to. We allow str here since the actual list of allowed models changes frequently. | *required* | | `messages` | `list[ModelMessage]` | Messages to send to the model | *required* | | `model_settings` | `ModelSettings | None` | optional model settings | `None` | | `model_request_parameters` | `ModelRequestParameters | None` | optional model request parameters | `None` | | `instrument` | `InstrumentationSettings | bool | None` | Whether to instrument the request with OpenTelemetry/Logfire, if None the value from logfire.instrument_pydantic_ai is used. | `None` | Returns: | Type | Description | | --- | --- | | `ModelResponse` | The model response and token usage associated with the request. 
| Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ````python async def model_request( model: models.Model | models.KnownModelName | str, messages: list[messages.ModelMessage], *, model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> messages.ModelResponse: """Make a non-streamed request to a model. ```py title="model_request_example.py" from pydantic_ai.direct import model_request from pydantic_ai.messages import ModelRequest async def main(): model_response = await model_request( 'anthropic:claude-3-5-haiku-latest', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1)! ) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-3-5-haiku-latest', timestamp=datetime.datetime(...), ) ''' ``` 1. See [`ModelRequest.user_text_prompt`][pydantic_ai.messages.ModelRequest.user_text_prompt] for details. Args: model: The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. messages: Messages to send to the model model_settings: optional model settings model_request_parameters: optional model request parameters instrument: Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`][logfire.Logfire.instrument_pydantic_ai] is used. Returns: The model response and token usage associated with the request. """ model_instance = _prepare_model(model, instrument) return await model_instance.request( messages, model_settings, model_instance.customize_request_parameters(model_request_parameters or models.ModelRequestParameters()), ) ```` ### model_request_sync ```python model_request_sync( model: Model | KnownModelName | str, messages: list[ModelMessage], *, model_settings: ModelSettings | None = None, model_request_parameters: ( ModelRequestParameters | None ) = None, instrument: InstrumentationSettings | bool | None = None ) -> ModelResponse ``` Make a Synchronous, non-streamed request to a model. This is a convenience method that wraps model_request with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. model_request_sync_example.py ```py from pydantic_ai.direct import model_request_sync from pydantic_ai.messages import ModelRequest model_response = model_request_sync( 'anthropic:claude-3-5-haiku-latest', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1)! ) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-3-5-haiku-latest', timestamp=datetime.datetime(...), ) ''' ``` 1. See ModelRequest.user_text_prompt for details. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model` | `Model | KnownModelName | str` | The model to make a request to. We allow str here since the actual list of allowed models changes frequently. 
| *required* | | `messages` | `list[ModelMessage]` | Messages to send to the model | *required* | | `model_settings` | `ModelSettings | None` | optional model settings | `None` | | `model_request_parameters` | `ModelRequestParameters | None` | optional model request parameters | `None` | | `instrument` | `InstrumentationSettings | bool | None` | Whether to instrument the request with OpenTelemetry/Logfire, if None the value from logfire.instrument_pydantic_ai is used. | `None` | Returns: | Type | Description | | --- | --- | | `ModelResponse` | The model response and token usage associated with the request. | Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ````python def model_request_sync( model: models.Model | models.KnownModelName | str, messages: list[messages.ModelMessage], *, model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> messages.ModelResponse: """Make a Synchronous, non-streamed request to a model. This is a convenience method that wraps [`model_request`][pydantic_ai.direct.model_request] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. ```py title="model_request_sync_example.py" from pydantic_ai.direct import model_request_sync from pydantic_ai.messages import ModelRequest model_response = model_request_sync( 'anthropic:claude-3-5-haiku-latest', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1)! ) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-3-5-haiku-latest', timestamp=datetime.datetime(...), ) ''' ``` 1. See [`ModelRequest.user_text_prompt`][pydantic_ai.messages.ModelRequest.user_text_prompt] for details. Args: model: The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. messages: Messages to send to the model model_settings: optional model settings model_request_parameters: optional model request parameters instrument: Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`][logfire.Logfire.instrument_pydantic_ai] is used. Returns: The model response and token usage associated with the request. """ return _get_event_loop().run_until_complete( model_request( model, messages, model_settings=model_settings, model_request_parameters=model_request_parameters, instrument=instrument, ) ) ```` ### model_request_stream ```python model_request_stream( model: Model | KnownModelName | str, messages: list[ModelMessage], *, model_settings: ModelSettings | None = None, model_request_parameters: ( ModelRequestParameters | None ) = None, instrument: InstrumentationSettings | bool | None = None ) -> AbstractAsyncContextManager[StreamedResponse] ``` Make a streamed async request to a model. model_request_stream_example.py ```py from pydantic_ai.direct import model_request_stream from pydantic_ai.messages import ModelRequest async def main(): messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] # (1)! 
async with model_request_stream('openai:gpt-4.1-mini', messages) as stream: chunks = [] async for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), ] ''' ``` 1. See ModelRequest.user_text_prompt for details. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model` | `Model | KnownModelName | str` | The model to make a request to. We allow str here since the actual list of allowed models changes frequently. | *required* | | `messages` | `list[ModelMessage]` | Messages to send to the model | *required* | | `model_settings` | `ModelSettings | None` | optional model settings | `None` | | `model_request_parameters` | `ModelRequestParameters | None` | optional model request parameters | `None` | | `instrument` | `InstrumentationSettings | bool | None` | Whether to instrument the request with OpenTelemetry/Logfire, if None the value from logfire.instrument_pydantic_ai is used. | `None` | Returns: | Type | Description | | --- | --- | | `AbstractAsyncContextManager[StreamedResponse]` | A stream response async context manager. | Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ````python def model_request_stream( model: models.Model | models.KnownModelName | str, messages: list[messages.ModelMessage], *, model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> AbstractAsyncContextManager[models.StreamedResponse]: """Make a streamed async request to a model. ```py {title="model_request_stream_example.py"} from pydantic_ai.direct import model_request_stream from pydantic_ai.messages import ModelRequest async def main(): messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] # (1)! async with model_request_stream('openai:gpt-4.1-mini', messages) as stream: chunks = [] async for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), ] ''' ``` 1. See [`ModelRequest.user_text_prompt`][pydantic_ai.messages.ModelRequest.user_text_prompt] for details. Args: model: The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. messages: Messages to send to the model model_settings: optional model settings model_request_parameters: optional model request parameters instrument: Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`][logfire.Logfire.instrument_pydantic_ai] is used. Returns: A [stream response][pydantic_ai.models.StreamedResponse] async context manager. 
""" model_instance = _prepare_model(model, instrument) return model_instance.request_stream( messages, model_settings, model_instance.customize_request_parameters(model_request_parameters or models.ModelRequestParameters()), ) ```` ### model_request_stream_sync ```python model_request_stream_sync( model: Model | KnownModelName | str, messages: list[ModelMessage], *, model_settings: ModelSettings | None = None, model_request_parameters: ( ModelRequestParameters | None ) = None, instrument: InstrumentationSettings | bool | None = None ) -> StreamedResponseSync ``` Make a streamed synchronous request to a model. This is the synchronous version of model_request_stream. It uses threading to run the asynchronous stream in the background while providing a synchronous iterator interface. model_request_stream_sync_example.py ```py from pydantic_ai.direct import model_request_stream_sync from pydantic_ai.messages import ModelRequest messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] with model_request_stream_sync('openai:gpt-4.1-mini', messages) as stream: chunks = [] for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), ] ''' ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model` | `Model | KnownModelName | str` | The model to make a request to. We allow str here since the actual list of allowed models changes frequently. | *required* | | `messages` | `list[ModelMessage]` | Messages to send to the model | *required* | | `model_settings` | `ModelSettings | None` | optional model settings | `None` | | `model_request_parameters` | `ModelRequestParameters | None` | optional model request parameters | `None` | | `instrument` | `InstrumentationSettings | bool | None` | Whether to instrument the request with OpenTelemetry/Logfire, if None the value from logfire.instrument_pydantic_ai is used. | `None` | Returns: | Type | Description | | --- | --- | | `StreamedResponseSync` | A sync stream response context manager. | Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ````python def model_request_stream_sync( model: models.Model | models.KnownModelName | str, messages: list[messages.ModelMessage], *, model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> StreamedResponseSync: """Make a streamed synchronous request to a model. This is the synchronous version of [`model_request_stream`][pydantic_ai.direct.model_request_stream]. It uses threading to run the asynchronous stream in the background while providing a synchronous iterator interface. 
```py {title="model_request_stream_sync_example.py"} from pydantic_ai.direct import model_request_stream_sync from pydantic_ai.messages import ModelRequest messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] with model_request_stream_sync('openai:gpt-4.1-mini', messages) as stream: chunks = [] for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), ] ''' ``` Args: model: The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. messages: Messages to send to the model model_settings: optional model settings model_request_parameters: optional model request parameters instrument: Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`][logfire.Logfire.instrument_pydantic_ai] is used. Returns: A [sync stream response][pydantic_ai.direct.StreamedResponseSync] context manager. """ async_stream_cm = model_request_stream( model=model, messages=messages, model_settings=model_settings, model_request_parameters=model_request_parameters, instrument=instrument, ) return StreamedResponseSync(async_stream_cm) ```` ### StreamedResponseSync Synchronous wrapper to async streaming responses by running the async producer in a background thread and providing a synchronous iterator. This class must be used as a context manager with the `with` statement. Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ```python @dataclass class StreamedResponseSync: """Synchronous wrapper to async streaming responses by running the async producer in a background thread and providing a synchronous iterator. This class must be used as a context manager with the `with` statement. """ _async_stream_cm: AbstractAsyncContextManager[StreamedResponse] _queue: queue.Queue[messages.ModelResponseStreamEvent | Exception | None] = field( default_factory=queue.Queue, init=False ) _thread: threading.Thread | None = field(default=None, init=False) _stream_response: StreamedResponse | None = field(default=None, init=False) _exception: Exception | None = field(default=None, init=False) _context_entered: bool = field(default=False, init=False) _stream_ready: threading.Event = field(default_factory=threading.Event, init=False) def __enter__(self) -> StreamedResponseSync: self._context_entered = True self._start_producer() return self def __exit__( self, _exc_type: type[BaseException] | None, _exc_val: BaseException | None, _exc_tb: TracebackType | None, ) -> None: self._cleanup() def __iter__(self) -> Iterator[messages.ModelResponseStreamEvent]: """Stream the response as an iterable of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s.""" self._check_context_manager_usage() while True: item = self._queue.get() if item is None: # End of stream break elif isinstance(item, Exception): raise item else: yield item def __repr__(self) -> str: if self._stream_response: return repr(self._stream_response) else: return f'{self.__class__.__name__}(context_entered={self._context_entered})' __str__ = __repr__ def _check_context_manager_usage(self) -> None: if not self._context_entered: raise RuntimeError( 'StreamedResponseSync must be used as a context manager. 
' 'Use: `with model_request_stream_sync(...) as stream:`' ) def _ensure_stream_ready(self) -> StreamedResponse: self._check_context_manager_usage() if self._stream_response is None: # Wait for the background thread to signal that the stream is ready if not self._stream_ready.wait(timeout=STREAM_INITIALIZATION_TIMEOUT): raise RuntimeError('Stream failed to initialize within timeout') if self._stream_response is None: # pragma: no cover raise RuntimeError('Stream failed to initialize') return self._stream_response def _start_producer(self): self._thread = threading.Thread(target=self._async_producer, daemon=True) self._thread.start() def _async_producer(self): async def _consume_async_stream(): try: async with self._async_stream_cm as stream: self._stream_response = stream # Signal that the stream is ready self._stream_ready.set() async for event in stream: self._queue.put(event) except Exception as e: # Signal ready even on error so waiting threads don't hang self._stream_ready.set() self._queue.put(e) finally: self._queue.put(None) # Signal end _get_event_loop().run_until_complete(_consume_async_stream()) def _cleanup(self): if self._thread and self._thread.is_alive(): self._thread.join() def get(self) -> messages.ModelResponse: """Build a ModelResponse from the data received from the stream so far.""" return self._ensure_stream_ready().get() def usage(self) -> RequestUsage: """Get the usage of the response so far.""" return self._ensure_stream_ready().usage() @property def model_name(self) -> str: """Get the model name of the response.""" return self._ensure_stream_ready().model_name @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._ensure_stream_ready().timestamp ``` #### __iter__ ```python __iter__() -> Iterator[ModelResponseStreamEvent] ``` Stream the response as an iterable of ModelResponseStreamEvents. Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ```python def __iter__(self) -> Iterator[messages.ModelResponseStreamEvent]: """Stream the response as an iterable of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s.""" self._check_context_manager_usage() while True: item = self._queue.get() if item is None: # End of stream break elif isinstance(item, Exception): raise item else: yield item ``` #### get ```python get() -> ModelResponse ``` Build a ModelResponse from the data received from the stream so far. Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ```python def get(self) -> messages.ModelResponse: """Build a ModelResponse from the data received from the stream so far.""" return self._ensure_stream_ready().get() ``` #### usage ```python usage() -> RequestUsage ``` Get the usage of the response so far. Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ```python def usage(self) -> RequestUsage: """Get the usage of the response so far.""" return self._ensure_stream_ready().usage() ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. 
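Building on the `model_request_stream_sync` example above, here is a minimal sketch showing how the `get()`, `usage()` and `model_name` members documented for `StreamedResponseSync` can be used once the stream has been consumed. The model name and prompt are simply reused from that example; the printed values will depend on the model you call.

```python
from pydantic_ai.direct import model_request_stream_sync
from pydantic_ai.messages import ModelRequest

messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')]
with model_request_stream_sync('openai:gpt-4.1-mini', messages) as stream:
    # Iterating the context manager yields ModelResponseStreamEvent items,
    # exactly as in the example above.
    for event in stream:
        print(event)
    # Build a ModelResponse from everything received so far and inspect it.
    response = stream.get()
    print(response.model_name)
    # Token usage accumulated for the response so far.
    print(stream.usage())
```

Note that these accessors wait for the background thread to initialize the stream (see `_ensure_stream_ready` in the source above), so they can also be called before iteration if you only need metadata such as `model_name` or `timestamp`.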
# `pydantic_ai.durable_exec` ### TemporalAgent Bases: `WrapperAgent[AgentDepsT, OutputDataT]` Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python class TemporalAgent(WrapperAgent[AgentDepsT, OutputDataT]): def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, activity_config: ActivityConfig | None = None, model_activity_config: ActivityConfig | None = None, toolset_activity_config: dict[str, ActivityConfig] | None = None, tool_activity_config: dict[str, dict[str, ActivityConfig | Literal[False]]] | None = None, run_context_type: type[TemporalRunContext[AgentDepsT]] = TemporalRunContext[AgentDepsT], temporalize_toolset_func: Callable[ [ AbstractToolset[AgentDepsT], str, ActivityConfig, dict[str, ActivityConfig | Literal[False]], type[AgentDepsT], type[TemporalRunContext[AgentDepsT]], ], AbstractToolset[AgentDepsT], ] = temporalize_toolset, ): """Wrap an agent to enable it to be used inside a Temporal workflow, by automatically offloading model requests, tool calls, and MCP server communication to Temporal activities. After wrapping, the original agent can still be used as normal outside of the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent. Args: wrapped: The agent to wrap. name: Optional unique agent name to use in the Temporal activities' names. If not provided, the agent's `name` will be used. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. activity_config: The base Temporal activity config to use for all activities. If no config is provided, a `start_to_close_timeout` of 60 seconds is used. model_activity_config: The Temporal activity config to use for model request activities. This is merged with the base activity config. toolset_activity_config: The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config. tool_activity_config: The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs. If a tool does not use IO, you can specify `False` to disable using an activity. Note that the tool is required to be defined as an `async` function as non-async tools are run in threads which are non-deterministic and thus not supported outside of activities. run_context_type: The `TemporalRunContext` subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `retries`, `tool_call_id`, `tool_name`, `retry` and `run_step` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute. temporalize_toolset_func: Optional function to use to prepare "leaf" toolsets (i.e. those that implement their own tool listing and calling) for Temporal by wrapping them in a `TemporalWrapperToolset` that moves methods that require IO to Temporal activities. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Temporal. The function takes the toolset, the activity name prefix, the toolset-specific activity config, the tool-specific activity configs and the run context type. 
""" super().__init__(wrapped) self._name = name self._event_stream_handler = event_stream_handler self.run_context_type = run_context_type # start_to_close_timeout is required activity_config = activity_config or ActivityConfig(start_to_close_timeout=timedelta(seconds=60)) # `pydantic_ai.exceptions.UserError` and `pydantic.errors.PydanticUserError` are not retryable retry_policy = activity_config.get('retry_policy') or RetryPolicy() retry_policy.non_retryable_error_types = [ *(retry_policy.non_retryable_error_types or []), UserError.__name__, PydanticUserError.__name__, ] activity_config['retry_policy'] = retry_policy self.activity_config = activity_config model_activity_config = model_activity_config or {} toolset_activity_config = toolset_activity_config or {} tool_activity_config = tool_activity_config or {} if self.name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with Temporal. The name will be used to identify the agent's activities within the workflow." ) activity_name_prefix = f'agent__{self.name}' activities: list[Callable[..., Any]] = [] if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with Temporal, it cannot be set at agent run time.' ) async def event_stream_handler_activity(params: _EventStreamHandlerParams, deps: AgentDepsT) -> None: # We can never get here without an `event_stream_handler`, as `TemporalAgent.run_stream` and `TemporalAgent.iter` raise an error saying to use `TemporalAgent.run` instead, # and that only ends up calling `event_stream_handler` if it is set. assert self.event_stream_handler is not None run_context = self.run_context_type.deserialize_run_context(params.serialized_run_context, deps=deps) async def streamed_response(): yield params.event await self.event_stream_handler(run_context, streamed_response()) # Set type hint explicitly so that Temporal can take care of serialization and deserialization event_stream_handler_activity.__annotations__['deps'] = self.deps_type self.event_stream_handler_activity = activity.defn(name=f'{activity_name_prefix}__event_stream_handler')( event_stream_handler_activity ) activities.append(self.event_stream_handler_activity) temporal_model = TemporalModel( wrapped.model, activity_name_prefix=activity_name_prefix, activity_config=activity_config | model_activity_config, deps_type=self.deps_type, run_context_type=self.run_context_type, event_stream_handler=self.event_stream_handler, ) activities.extend(temporal_model.temporal_activities) def temporalize_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: id = toolset.id if id is None: raise UserError( "Toolsets that are 'leaves' (i.e. those that implement their own tool listing and calling) need to have a unique `id` in order to be used with Temporal. The ID will be used to identify the toolset's activities within the workflow." 
) toolset = temporalize_toolset_func( toolset, activity_name_prefix, activity_config | toolset_activity_config.get(id, {}), tool_activity_config.get(id, {}), self.deps_type, self.run_context_type, ) if isinstance(toolset, TemporalWrapperToolset): activities.extend(toolset.temporal_activities) return toolset temporal_toolsets = [toolset.visit_and_replace(temporalize_toolset) for toolset in wrapped.toolsets] self._model = temporal_model self._toolsets = temporal_toolsets self._temporal_activities = activities self._temporal_overrides_active: ContextVar[bool] = ContextVar('_temporal_overrides_active', default=False) @property def name(self) -> str | None: return self._name or super().name @name.setter def name(self, value: str | None) -> None: # pragma: no cover raise UserError( 'The agent name cannot be changed after creation. If you need to change the name, create a new agent.' ) @property def model(self) -> Model: return self._model @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: handler = self._event_stream_handler or super().event_stream_handler if handler is None: return None elif workflow.in_workflow(): return self._call_event_stream_handler_activity else: return handler async def _call_event_stream_handler_activity( self, ctx: RunContext[AgentDepsT], stream: AsyncIterable[_messages.AgentStreamEvent] ) -> None: serialized_run_context = self.run_context_type.serialize_run_context(ctx) async for event in stream: await workflow.execute_activity( # pyright: ignore[reportUnknownMemberType] activity=self.event_stream_handler_activity, args=[ _EventStreamHandlerParams( event=event, serialized_run_context=serialized_run_context, ), ctx.deps, ], **self.activity_config, ) @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: with self._temporal_overrides(): return super().toolsets @property def temporal_activities(self) -> list[Callable[..., Any]]: return self._temporal_activities @contextmanager def _temporal_overrides(self) -> Iterator[None]: # We reset tools here as the temporalized function toolset is already in self._toolsets. with super().override(model=self._model, toolsets=self._toolsets, tools=[]): token = self._temporal_overrides_active.set(True) try: yield except PydanticSerializationError as e: raise UserError( "The `deps` object failed to be serialized. Temporal requires all objects that are passed to activities to be serializable using Pydantic's `TypeAdapter`." ) from e finally: self._temporal_overrides_active.reset(token) @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... 
@overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. """ if workflow.in_workflow() and event_stream_handler is not None: raise UserError( 'Event stream handler cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' 
) with self._temporal_overrides(): return await super().run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler or self.event_stream_handler, **_deprecated_kwargs, ) @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. 
usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. """ if workflow.in_workflow(): raise UserError( '`agent.run_sync()` cannot be used inside a Temporal workflow. Use `await agent.run()` instead.' ) return super().run_sync( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]]: ... @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. 
output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if workflow.in_workflow(): raise UserError( '`agent.run_stream()` cannot currently be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead. ' 'Please file an issue if this is not sufficient for your use case.' ) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) as result: yield result @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... 
@asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. Returns: The result of the run. """ if workflow.in_workflow(): if not self._temporal_overrides_active.get(): raise UserError( '`agent.iter()` cannot currently be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead. 
' 'Please file an issue if this is not sufficient for your use case.' ) if model is not None: raise UserError( 'Model cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' ) if toolsets is not None: raise UserError( 'Toolsets cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' ) async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, **_deprecated_kwargs, ) as run: yield run @contextmanager def override( self, *, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. """ if workflow.in_workflow(): if _utils.is_set(model): raise UserError( 'Model cannot be contextually overridden inside a Temporal workflow, it must be set at agent creation time.' ) if _utils.is_set(toolsets): raise UserError( 'Toolsets cannot be contextually overridden inside a Temporal workflow, they must be set at agent creation time.' ) if _utils.is_set(tools): raise UserError( 'Tools cannot be contextually overridden inside a Temporal workflow, they must be set at agent creation time.' ) with super().override(deps=deps, model=model, toolsets=toolsets, tools=tools): yield ```` #### __init__ ```python __init__( wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, activity_config: ActivityConfig | None = None, model_activity_config: ActivityConfig | None = None, toolset_activity_config: ( dict[str, ActivityConfig] | None ) = None, tool_activity_config: ( dict[ str, dict[str, ActivityConfig | Literal[False]] ] | None ) = None, run_context_type: type[ TemporalRunContext[AgentDepsT] ] = TemporalRunContext[AgentDepsT], temporalize_toolset_func: Callable[ [ AbstractToolset[AgentDepsT], str, ActivityConfig, dict[str, ActivityConfig | Literal[False]], type[AgentDepsT], type[TemporalRunContext[AgentDepsT]], ], AbstractToolset[AgentDepsT], ] = temporalize_toolset ) ``` Wrap an agent to enable it to be used inside a Temporal workflow, by automatically offloading model requests, tool calls, and MCP server communication to Temporal activities. After wrapping, the original agent can still be used as normal outside of the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `wrapped` | `AbstractAgent[AgentDepsT, OutputDataT]` | The agent to wrap. 
| *required* | | `name` | `str | None` | Optional unique agent name to use in the Temporal activities' names. If not provided, the agent's name will be used. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use instead of the one set on the wrapped agent. | `None` | | `activity_config` | `ActivityConfig | None` | The base Temporal activity config to use for all activities. If no config is provided, a start_to_close_timeout of 60 seconds is used. | `None` | | `model_activity_config` | `ActivityConfig | None` | The Temporal activity config to use for model request activities. This is merged with the base activity config. | `None` | | `toolset_activity_config` | `dict[str, ActivityConfig] | None` | The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config. | `None` | | `tool_activity_config` | `dict[str, dict[str, ActivityConfig | Literal[False]]] | None` | The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs. If a tool does not use IO, you can specify False to disable using an activity. Note that the tool is required to be defined as an async function as non-async tools are run in threads which are non-deterministic and thus not supported outside of activities. | `None` | | `run_context_type` | `type[TemporalRunContext[AgentDepsT]]` | The TemporalRunContext subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the deps, retries, tool_call_id, tool_name, retry and run_step attributes will be available. To make another attribute available, create a TemporalRunContext subclass with a custom serialize_run_context class method that returns a dictionary that includes the attribute. | `TemporalRunContext[AgentDepsT]` | | `temporalize_toolset_func` | `Callable[[AbstractToolset[AgentDepsT], str, ActivityConfig, dict[str, ActivityConfig | Literal[False]], type[AgentDepsT], type[TemporalRunContext[AgentDepsT]]], AbstractToolset[AgentDepsT]]` | Optional function to use to prepare "leaf" toolsets (i.e. those that implement their own tool listing and calling) for Temporal by wrapping them in a TemporalWrapperToolset that moves methods that require IO to Temporal activities. If not provided, only FunctionToolset and MCPServer will be prepared for Temporal. The function takes the toolset, the activity name prefix, the toolset-specific activity config, the tool-specific activity configs and the run context type. 
| `temporalize_toolset` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ```python def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, activity_config: ActivityConfig | None = None, model_activity_config: ActivityConfig | None = None, toolset_activity_config: dict[str, ActivityConfig] | None = None, tool_activity_config: dict[str, dict[str, ActivityConfig | Literal[False]]] | None = None, run_context_type: type[TemporalRunContext[AgentDepsT]] = TemporalRunContext[AgentDepsT], temporalize_toolset_func: Callable[ [ AbstractToolset[AgentDepsT], str, ActivityConfig, dict[str, ActivityConfig | Literal[False]], type[AgentDepsT], type[TemporalRunContext[AgentDepsT]], ], AbstractToolset[AgentDepsT], ] = temporalize_toolset, ): """Wrap an agent to enable it to be used inside a Temporal workflow, by automatically offloading model requests, tool calls, and MCP server communication to Temporal activities. After wrapping, the original agent can still be used as normal outside of the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent. Args: wrapped: The agent to wrap. name: Optional unique agent name to use in the Temporal activities' names. If not provided, the agent's `name` will be used. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. activity_config: The base Temporal activity config to use for all activities. If no config is provided, a `start_to_close_timeout` of 60 seconds is used. model_activity_config: The Temporal activity config to use for model request activities. This is merged with the base activity config. toolset_activity_config: The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config. tool_activity_config: The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs. If a tool does not use IO, you can specify `False` to disable using an activity. Note that the tool is required to be defined as an `async` function as non-async tools are run in threads which are non-deterministic and thus not supported outside of activities. run_context_type: The `TemporalRunContext` subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `retries`, `tool_call_id`, `tool_name`, `retry` and `run_step` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute. temporalize_toolset_func: Optional function to use to prepare "leaf" toolsets (i.e. those that implement their own tool listing and calling) for Temporal by wrapping them in a `TemporalWrapperToolset` that moves methods that require IO to Temporal activities. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Temporal. The function takes the toolset, the activity name prefix, the toolset-specific activity config, the tool-specific activity configs and the run context type. 
""" super().__init__(wrapped) self._name = name self._event_stream_handler = event_stream_handler self.run_context_type = run_context_type # start_to_close_timeout is required activity_config = activity_config or ActivityConfig(start_to_close_timeout=timedelta(seconds=60)) # `pydantic_ai.exceptions.UserError` and `pydantic.errors.PydanticUserError` are not retryable retry_policy = activity_config.get('retry_policy') or RetryPolicy() retry_policy.non_retryable_error_types = [ *(retry_policy.non_retryable_error_types or []), UserError.__name__, PydanticUserError.__name__, ] activity_config['retry_policy'] = retry_policy self.activity_config = activity_config model_activity_config = model_activity_config or {} toolset_activity_config = toolset_activity_config or {} tool_activity_config = tool_activity_config or {} if self.name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with Temporal. The name will be used to identify the agent's activities within the workflow." ) activity_name_prefix = f'agent__{self.name}' activities: list[Callable[..., Any]] = [] if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with Temporal, it cannot be set at agent run time.' ) async def event_stream_handler_activity(params: _EventStreamHandlerParams, deps: AgentDepsT) -> None: # We can never get here without an `event_stream_handler`, as `TemporalAgent.run_stream` and `TemporalAgent.iter` raise an error saying to use `TemporalAgent.run` instead, # and that only ends up calling `event_stream_handler` if it is set. assert self.event_stream_handler is not None run_context = self.run_context_type.deserialize_run_context(params.serialized_run_context, deps=deps) async def streamed_response(): yield params.event await self.event_stream_handler(run_context, streamed_response()) # Set type hint explicitly so that Temporal can take care of serialization and deserialization event_stream_handler_activity.__annotations__['deps'] = self.deps_type self.event_stream_handler_activity = activity.defn(name=f'{activity_name_prefix}__event_stream_handler')( event_stream_handler_activity ) activities.append(self.event_stream_handler_activity) temporal_model = TemporalModel( wrapped.model, activity_name_prefix=activity_name_prefix, activity_config=activity_config | model_activity_config, deps_type=self.deps_type, run_context_type=self.run_context_type, event_stream_handler=self.event_stream_handler, ) activities.extend(temporal_model.temporal_activities) def temporalize_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: id = toolset.id if id is None: raise UserError( "Toolsets that are 'leaves' (i.e. those that implement their own tool listing and calling) need to have a unique `id` in order to be used with Temporal. The ID will be used to identify the toolset's activities within the workflow." 
) toolset = temporalize_toolset_func( toolset, activity_name_prefix, activity_config | toolset_activity_config.get(id, {}), tool_activity_config.get(id, {}), self.deps_type, self.run_context_type, ) if isinstance(toolset, TemporalWrapperToolset): activities.extend(toolset.temporal_activities) return toolset temporal_toolsets = [toolset.visit_and_replace(temporalize_toolset) for toolset in wrapped.toolsets] self._model = temporal_model self._toolsets = temporal_toolsets self._temporal_activities = activities self._temporal_overrides_active: ContextVar[bool] = ContextVar('_temporal_overrides_active', default=False) ``` #### run ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. 
| `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. 
""" if workflow.in_workflow() and event_stream_handler is not None: raise UserError( 'Event stream handler cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' ) with self._temporal_overrides(): return await super().run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler or self.event_stream_handler, **_deprecated_kwargs, ) ```` #### run_sync ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. 
| `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. 
""" if workflow.in_workflow(): raise UserError( '`agent.run_sync()` cannot be used inside a Temporal workflow. Use `await agent.run()` instead.' ) return super().run_sync( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) ```` #### run_stream ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, OutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, RunOutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]] ``` Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. 
| `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[StreamedRunResult[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. 
Returns: The result of the run. """ if workflow.in_workflow(): raise UserError( '`agent.run_stream()` cannot currently be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead. ' 'Please file an issue if this is not sufficient for your use case.' ) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) as result: yield result ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, **_deprecated_kwargs: Never ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, **_deprecated_kwargs: Never ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, **_deprecated_kwargs: Never ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
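Note that inside a Temporal workflow this method currently raises a `UserError`: set an `event_stream_handler` on the agent and call `run()` instead (see the source below). The following is a minimal sketch of that pattern, not a definitive implementation; the agent name `'capital_agent'`, the `handle_events` handler, and the `CapitalWorkflow` class are illustrative assumptions, and registering the workflow and the agent's activities with a Temporal worker is omitted.

```python
from collections.abc import AsyncIterable

from temporalio import workflow

from pydantic_ai import Agent, RunContext
from pydantic_ai.durable_exec.temporal import TemporalAgent
from pydantic_ai.messages import AgentStreamEvent

# A `model` and a unique `name` must be set at agent creation time for Temporal.
agent = Agent('openai:gpt-4o', name='capital_agent')


async def handle_events(ctx: RunContext[None], events: AsyncIterable[AgentStreamEvent]) -> None:
    # Hypothetical handler: when running inside a workflow, each event is
    # delivered to this handler via a Temporal activity.
    async for event in events:
        print(event)


temporal_agent = TemporalAgent(agent, event_stream_handler=handle_events)


@workflow.defn
class CapitalWorkflow:
    @workflow.run
    async def run(self, prompt: str) -> str:
        # Use `run()` here; `iter()` and `run_stream()` raise a `UserError` inside a workflow.
        result = await temporal_agent.run(prompt)
        return result.output
```

Outside of a Temporal workflow, `iter()` behaves as it does on the wrapped agent: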
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. 
This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. Returns: The result of the run. """ if workflow.in_workflow(): if not self._temporal_overrides_active.get(): raise UserError( '`agent.iter()` cannot currently be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead. ' 'Please file an issue if this is not sufficient for your use case.' ) if model is not None: raise UserError( 'Model cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' ) if toolsets is not None: raise UserError( 'Toolsets cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' 
) async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, **_deprecated_kwargs, ) as run: yield run ```` #### override ```python override( *, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../../testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `deps` | `AgentDepsT | Unset` | The dependencies to use instead of the dependencies passed to the agent run. | `UNSET` | | `model` | `Model | KnownModelName | str | Unset` | The model to use instead of the model passed to the agent run. | `UNSET` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | Unset` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | `UNSET` | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | Unset` | The tools to use instead of the tools registered with the agent. | `UNSET` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ```python @contextmanager def override( self, *, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. """ if workflow.in_workflow(): if _utils.is_set(model): raise UserError( 'Model cannot be contextually overridden inside a Temporal workflow, it must be set at agent creation time.' ) if _utils.is_set(toolsets): raise UserError( 'Toolsets cannot be contextually overridden inside a Temporal workflow, they must be set at agent creation time.' ) if _utils.is_set(tools): raise UserError( 'Tools cannot be contextually overridden inside a Temporal workflow, they must be set at agent creation time.' ) with super().override(deps=deps, model=model, toolsets=toolsets, tools=tools): yield ``` ### LogfirePlugin Bases: `Plugin` Temporal client plugin for Logfire. 
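A minimal sketch of passing this plugin, together with `PydanticAIPlugin`, when connecting a Temporal client. The server address, the `connect_client` helper, and the availability of the `plugins` argument on `Client.connect` (present in recent `temporalio` releases) are assumptions here, not guarantees:

```python
from temporalio.client import Client

from pydantic_ai.durable_exec.temporal import LogfirePlugin, PydanticAIPlugin


async def connect_client() -> Client:
    # Hypothetical helper: the Pydantic AI plugin installs the Pydantic data
    # converter, and the Logfire plugin adds tracing (and, by default, metrics).
    return await Client.connect(
        'localhost:7233',  # assumed local Temporal server address
        plugins=[PydanticAIPlugin(), LogfirePlugin()],
    )
```

To disable metrics export, pass `LogfirePlugin(metrics=False)`; a custom `setup_logfire` callable can be passed as the first argument to control how Logfire is configured.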
Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_logfire.py` ```python class LogfirePlugin(ClientPlugin): """Temporal client plugin for Logfire.""" def __init__(self, setup_logfire: Callable[[], Logfire] = _default_setup_logfire, *, metrics: bool = True): self.setup_logfire = setup_logfire self.metrics = metrics def init_client_plugin(self, next: ClientPlugin) -> None: self.next_client_plugin = next def configure_client(self, config: ClientConfig) -> ClientConfig: interceptors = config.get('interceptors', []) config['interceptors'] = [*interceptors, TracingInterceptor(get_tracer('temporalio'))] return self.next_client_plugin.configure_client(config) async def connect_service_client(self, config: ConnectConfig) -> ServiceClient: logfire = self.setup_logfire() if self.metrics: logfire_config = logfire.config token = logfire_config.token if logfire_config.send_to_logfire and token is not None and logfire_config.metrics is not False: base_url = logfire_config.advanced.generate_base_url(token) metrics_url = base_url + '/v1/metrics' headers = {'Authorization': f'Bearer {token}'} config.runtime = Runtime( telemetry=TelemetryConfig(metrics=OpenTelemetryConfig(url=metrics_url, headers=headers)) ) return await self.next_client_plugin.connect_service_client(config) ``` ### TemporalRunContext Bases: `RunContext[AgentDepsT]` The RunContext subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry` and `run_step` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute and pass it to TemporalAgent. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_run_context.py` ```python class TemporalRunContext(RunContext[AgentDepsT]): """The [`RunContext`][pydantic_ai.tools.RunContext] subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry` and `run_step` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute and pass it to [`TemporalAgent`][pydantic_ai.durable_exec.temporal.TemporalAgent]. """ def __init__(self, deps: AgentDepsT, **kwargs: Any): self.__dict__ = {**kwargs, 'deps': deps} setattr( self, '__dataclass_fields__', {name: field for name, field in RunContext.__dataclass_fields__.items() if name in self.__dict__}, ) def __getattribute__(self, name: str) -> Any: try: return super().__getattribute__(name) except AttributeError as e: # pragma: no cover if name in RunContext.__dataclass_fields__: raise UserError( f'{self.__class__.__name__!r} object has no attribute {name!r}. ' 'To make the attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute and pass it to `TemporalAgent`.' 
) else: raise e @classmethod def serialize_run_context(cls, ctx: RunContext[Any]) -> dict[str, Any]: """Serialize the run context to a `dict[str, Any]`.""" return { 'retries': ctx.retries, 'tool_call_id': ctx.tool_call_id, 'tool_name': ctx.tool_name, 'tool_call_approved': ctx.tool_call_approved, 'retry': ctx.retry, 'run_step': ctx.run_step, } @classmethod def deserialize_run_context(cls, ctx: dict[str, Any], deps: AgentDepsT) -> TemporalRunContext[AgentDepsT]: """Deserialize the run context from a `dict[str, Any]`.""" return cls(**ctx, deps=deps) ``` #### serialize_run_context ```python serialize_run_context( ctx: RunContext[Any], ) -> dict[str, Any] ``` Serialize the run context to a `dict[str, Any]`. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_run_context.py` ```python @classmethod def serialize_run_context(cls, ctx: RunContext[Any]) -> dict[str, Any]: """Serialize the run context to a `dict[str, Any]`.""" return { 'retries': ctx.retries, 'tool_call_id': ctx.tool_call_id, 'tool_name': ctx.tool_name, 'tool_call_approved': ctx.tool_call_approved, 'retry': ctx.retry, 'run_step': ctx.run_step, } ``` #### deserialize_run_context ```python deserialize_run_context( ctx: dict[str, Any], deps: AgentDepsT ) -> TemporalRunContext[AgentDepsT] ``` Deserialize the run context from a `dict[str, Any]`. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_run_context.py` ```python @classmethod def deserialize_run_context(cls, ctx: dict[str, Any], deps: AgentDepsT) -> TemporalRunContext[AgentDepsT]: """Deserialize the run context from a `dict[str, Any]`.""" return cls(**ctx, deps=deps) ``` ### PydanticAIPlugin Bases: `Plugin`, `Plugin` Temporal client and worker plugin for Pydantic AI. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/__init__.py` ```python class PydanticAIPlugin(ClientPlugin, WorkerPlugin): """Temporal client and worker plugin for Pydantic AI.""" def init_client_plugin(self, next: ClientPlugin) -> None: self.next_client_plugin = next def init_worker_plugin(self, next: WorkerPlugin) -> None: self.next_worker_plugin = next def configure_client(self, config: ClientConfig) -> ClientConfig: config['data_converter'] = self._get_new_data_converter(config.get('data_converter')) return self.next_client_plugin.configure_client(config) def configure_worker(self, config: WorkerConfig) -> WorkerConfig: runner = config.get('workflow_runner') # pyright: ignore[reportUnknownMemberType] if isinstance(runner, SandboxedWorkflowRunner): # pragma: no branch config['workflow_runner'] = replace( runner, restrictions=runner.restrictions.with_passthrough_modules( 'pydantic_ai', 'pydantic', 'pydantic_core', 'logfire', 'rich', 'httpx', # Imported inside `logfire._internal.json_encoder` when running `logfire.info` inside an activity with attributes to serialize 'attrs', # Imported inside `logfire._internal.json_schema` when running `logfire.info` inside an activity with attributes to serialize 'numpy', 'pandas', ), ) config['workflow_failure_exception_types'] = [ *config.get('workflow_failure_exception_types', []), # pyright: ignore[reportUnknownMemberType] UserError, PydanticUserError, ] return self.next_worker_plugin.configure_worker(config) async def connect_service_client(self, config: ConnectConfig) -> ServiceClient: return await self.next_client_plugin.connect_service_client(config) async def run_worker(self, worker: Worker) -> None: await self.next_worker_plugin.run_worker(worker) def configure_replayer(self, config: ReplayerConfig) -> ReplayerConfig: # 
pragma: no cover config['data_converter'] = self._get_new_data_converter(config.get('data_converter')) # pyright: ignore[reportUnknownMemberType] return self.next_worker_plugin.configure_replayer(config) def run_replayer( self, replayer: Replayer, histories: AsyncIterator[WorkflowHistory], ) -> AbstractAsyncContextManager[AsyncIterator[WorkflowReplayResult]]: # pragma: no cover return self.next_worker_plugin.run_replayer(replayer, histories) def _get_new_data_converter(self, converter: DataConverter | None) -> DataConverter: if converter and converter.payload_converter_class not in ( DefaultPayloadConverter, PydanticPayloadConverter, ): warnings.warn( # pragma: no cover 'A non-default Temporal data converter was used which has been replaced with the Pydantic data converter.' ) return pydantic_data_converter ``` ### AgentPlugin Bases: `Plugin` Temporal worker plugin for a specific Pydantic AI agent. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/__init__.py` ```python class AgentPlugin(WorkerPlugin): """Temporal worker plugin for a specific Pydantic AI agent.""" def __init__(self, agent: TemporalAgent[Any, Any]): self.agent = agent def init_worker_plugin(self, next: WorkerPlugin) -> None: self.next_worker_plugin = next def configure_worker(self, config: WorkerConfig) -> WorkerConfig: activities: Sequence[Callable[..., Any]] = config.get('activities', []) # pyright: ignore[reportUnknownMemberType] # Activities are checked for name conflicts by Temporal. config['activities'] = [*activities, *self.agent.temporal_activities] return self.next_worker_plugin.configure_worker(config) async def run_worker(self, worker: Worker) -> None: await self.next_worker_plugin.run_worker(worker) def configure_replayer(self, config: ReplayerConfig) -> ReplayerConfig: # pragma: no cover return self.next_worker_plugin.configure_replayer(config) def run_replayer( self, replayer: Replayer, histories: AsyncIterator[WorkflowHistory], ) -> AbstractAsyncContextManager[AsyncIterator[WorkflowReplayResult]]: # pragma: no cover return self.next_worker_plugin.run_replayer(replayer, histories) ``` ### DBOSAgent Bases: `WrapperAgent[AgentDepsT, OutputDataT]`, `DBOSConfiguredInstance` Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python @DBOS.dbos_class() class DBOSAgent(WrapperAgent[AgentDepsT, OutputDataT], DBOSConfiguredInstance): def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, mcp_step_config: StepConfig | None = None, model_step_config: StepConfig | None = None, ): """Wrap an agent to enable it with DBOS durable workflows, by automatically offloading model requests, tool calls, and MCP server communication to DBOS steps. After wrapping, the original agent can still be used as normal outside of the DBOS workflow. Args: wrapped: The agent to wrap. name: Optional unique agent name to use as the DBOS configured instance name. If not provided, the agent's `name` will be used. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. mcp_step_config: The base DBOS step config to use for MCP server steps. If no config is provided, use the default settings of DBOS. model_step_config: The DBOS step config to use for model request steps. If no config is provided, use the default settings of DBOS. 
""" super().__init__(wrapped) self._name = name or wrapped.name self._event_stream_handler = event_stream_handler if self._name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with DBOS. The name will be used to identify the agent's workflows and steps." ) # Merge the config with the default DBOS config self._mcp_step_config = mcp_step_config or {} self._model_step_config = model_step_config or {} if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with DBOS, it cannot be set at agent run time.' ) dbos_model = DBOSModel( wrapped.model, step_name_prefix=self._name, step_config=self._model_step_config, event_stream_handler=self.event_stream_handler, ) self._model = dbos_model dbosagent_name = self._name def dbosify_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: # Replace MCPServer with DBOSMCPServer if isinstance(toolset, MCPServer): return DBOSMCPServer( wrapped=toolset, step_name_prefix=dbosagent_name, step_config=self._mcp_step_config, ) else: return toolset dbos_toolsets = [toolset.visit_and_replace(dbosify_toolset) for toolset in wrapped.toolsets] self._toolsets = dbos_toolsets DBOSConfiguredInstance.__init__(self, self._name) # Wrap the `run` method in a DBOS workflow @DBOS.workflow(name=f'{self._name}.run') async def wrapped_run_workflow( user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: with self._dbos_overrides(): return await super(WrapperAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) self.dbos_wrapped_run_workflow = wrapped_run_workflow # Wrap the `run_sync` method in a DBOS workflow @DBOS.workflow(name=f'{self._name}.run_sync') def wrapped_run_sync_workflow( user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: with self._dbos_overrides(): return super(DBOSAgent, self).run_sync( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, 
usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) self.dbos_wrapped_run_sync_workflow = wrapped_run_sync_workflow @property def name(self) -> str | None: return self._name @name.setter def name(self, value: str | None) -> None: # pragma: no cover raise UserError( 'The agent name cannot be changed after creation. If you need to change the name, create a new agent.' ) @property def model(self) -> Model: return self._model @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: handler = self._event_stream_handler or super().event_stream_handler if handler is None: return None elif DBOS.workflow_id is not None and DBOS.step_id is None: # Special case if it's in a DBOS workflow but not a step, we need to iterate through all events and call the handler. return self._call_event_stream_handler_in_workflow else: return handler async def _call_event_stream_handler_in_workflow( self, ctx: RunContext[AgentDepsT], stream: AsyncIterable[_messages.AgentStreamEvent] ) -> None: handler = self._event_stream_handler or super().event_stream_handler assert handler is not None async def streamed_response(event: _messages.AgentStreamEvent): yield event async for event in stream: await handler(ctx, streamed_response(event)) @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: with self._dbos_overrides(): return super().toolsets @contextmanager def _dbos_overrides(self) -> Iterator[None]: # Override with DBOSModel and DBOSMCPServer in the toolsets. with ( super().override(model=self._model, toolsets=self._toolsets, tools=[]), self.sequential_tool_calls(), ): yield @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... 
async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. """ return await self.dbos_wrapped_run_workflow( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... 
@overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. 
""" return self.dbos_wrapped_run_sync_workflow( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]]: ... @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. 
usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if DBOS.workflow_id is not None and DBOS.step_id is None: raise UserError( '`agent.run_stream()` cannot currently be used inside a DBOS workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead. ' 'Please file an issue if this is not sufficient for your use case.' ) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) as result: yield result @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. 
This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. Returns: The result of the run. """ if model is not None and not isinstance(model, DBOSModel): raise UserError( 'Non-DBOS model cannot be set at agent run time inside a DBOS workflow, it must be set at agent creation time.' ) with self._dbos_overrides(): async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, **_deprecated_kwargs, ) as run: yield run @contextmanager def override( self, *, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: deps: The dependencies to use instead of the dependencies passed to the agent run. 
model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. """ if _utils.is_set(model) and not isinstance(model, (DBOSModel)): raise UserError( 'Non-DBOS model cannot be contextually overridden inside a DBOS workflow, it must be set at agent creation time.' ) with super().override(deps=deps, model=model, toolsets=toolsets, tools=tools): yield ```` #### __init__ ```python __init__( wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, mcp_step_config: StepConfig | None = None, model_step_config: StepConfig | None = None ) ``` Wrap an agent to enable it with DBOS durable workflows, by automatically offloading model requests, tool calls, and MCP server communication to DBOS steps. After wrapping, the original agent can still be used as normal outside of the DBOS workflow. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `wrapped` | `AbstractAgent[AgentDepsT, OutputDataT]` | The agent to wrap. | *required* | | `name` | `str | None` | Optional unique agent name to use as the DBOS configured instance name. If not provided, the agent's name will be used. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use instead of the one set on the wrapped agent. | `None` | | `mcp_step_config` | `StepConfig | None` | The base DBOS step config to use for MCP server steps. If no config is provided, use the default settings of DBOS. | `None` | | `model_step_config` | `StepConfig | None` | The DBOS step config to use for model request steps. If no config is provided, use the default settings of DBOS. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ```python def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, mcp_step_config: StepConfig | None = None, model_step_config: StepConfig | None = None, ): """Wrap an agent to enable it with DBOS durable workflows, by automatically offloading model requests, tool calls, and MCP server communication to DBOS steps. After wrapping, the original agent can still be used as normal outside of the DBOS workflow. Args: wrapped: The agent to wrap. name: Optional unique agent name to use as the DBOS configured instance name. If not provided, the agent's `name` will be used. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. mcp_step_config: The base DBOS step config to use for MCP server steps. If no config is provided, use the default settings of DBOS. model_step_config: The DBOS step config to use for model request steps. If no config is provided, use the default settings of DBOS. """ super().__init__(wrapped) self._name = name or wrapped.name self._event_stream_handler = event_stream_handler if self._name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with DBOS. The name will be used to identify the agent's workflows and steps." 
) # Merge the config with the default DBOS config self._mcp_step_config = mcp_step_config or {} self._model_step_config = model_step_config or {} if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with DBOS, it cannot be set at agent run time.' ) dbos_model = DBOSModel( wrapped.model, step_name_prefix=self._name, step_config=self._model_step_config, event_stream_handler=self.event_stream_handler, ) self._model = dbos_model dbosagent_name = self._name def dbosify_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: # Replace MCPServer with DBOSMCPServer if isinstance(toolset, MCPServer): return DBOSMCPServer( wrapped=toolset, step_name_prefix=dbosagent_name, step_config=self._mcp_step_config, ) else: return toolset dbos_toolsets = [toolset.visit_and_replace(dbosify_toolset) for toolset in wrapped.toolsets] self._toolsets = dbos_toolsets DBOSConfiguredInstance.__init__(self, self._name) # Wrap the `run` method in a DBOS workflow @DBOS.workflow(name=f'{self._name}.run') async def wrapped_run_workflow( user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: with self._dbos_overrides(): return await super(WrapperAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) self.dbos_wrapped_run_workflow = wrapped_run_workflow # Wrap the `run_sync` method in a DBOS workflow @DBOS.workflow(name=f'{self._name}.run_sync') def wrapped_run_sync_workflow( user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: with self._dbos_overrides(): return super(DBOSAgent, self).run_sync( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) self.dbos_wrapped_run_sync_workflow = wrapped_run_sync_workflow ``` #### run ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, 
output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. 
| `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. 
""" return await self.dbos_wrapped_run_workflow( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) ```` #### run_sync ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. 
| `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. 
""" return self.dbos_wrapped_run_sync_workflow( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) ```` #### run_stream ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, OutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, RunOutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]] ``` Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. 
| `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[StreamedRunResult[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. 
Returns: The result of the run. """ if DBOS.workflow_id is not None and DBOS.step_id is None: raise UserError( '`agent.run_stream()` cannot currently be used inside a DBOS workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead. ' 'Please file an issue if this is not sufficient for your use case.' ) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) as result: yield result ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, **_deprecated_kwargs: Never ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, **_deprecated_kwargs: Never ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, **_deprecated_kwargs: Never ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `list[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: list[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. 
This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions=None, instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. Returns: The result of the run. """ if model is not None and not isinstance(model, DBOSModel): raise UserError( 'Non-DBOS model cannot be set at agent run time inside a DBOS workflow, it must be set at agent creation time.' ) with self._dbos_overrides(): async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, **_deprecated_kwargs, ) as run: yield run ```` #### override ```python override( *, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. 
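For instance, here is a minimal, hypothetical sketch of overriding only the dependencies of a DBOS-wrapped agent in a test. The `dbos_agent`, `my_app` module, and `FakeWeatherDeps` names are illustrative placeholders rather than part of the API, and DBOS itself is assumed to be configured and launched elsewhere; note that overriding `model` with a non-DBOS model raises `UserError`, as the source below shows.

```python
# Sketch only: `my_app.agents.dbos_agent` and `my_app.testing.FakeWeatherDeps`
# are hypothetical; DBOS is assumed to be configured and launched elsewhere.
from my_app.agents import dbos_agent
from my_app.testing import FakeWeatherDeps


def test_agent_with_fake_deps():
    # Only `deps` is overridden here; the DBOS-wrapped model and toolsets
    # configured at agent creation time are left in place.
    with dbos_agent.override(deps=FakeWeatherDeps()):
        result = dbos_agent.run_sync('What is the weather in London?')
    assert result.output is not None
```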
You can find an example of this [here](../../testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `deps` | `AgentDepsT | Unset` | The dependencies to use instead of the dependencies passed to the agent run. | `UNSET` | | `model` | `Model | KnownModelName | str | Unset` | The model to use instead of the model passed to the agent run. | `UNSET` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | Unset` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | `UNSET` | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | Unset` | The tools to use instead of the tools registered with the agent. | `UNSET` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ```python @contextmanager def override( self, *, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, or tools. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. """ if _utils.is_set(model) and not isinstance(model, (DBOSModel)): raise UserError( 'Non-DBOS model cannot be contextually overridden inside a DBOS workflow, it must be set at agent creation time.' ) with super().override(deps=deps, model=model, toolsets=toolsets, tools=tools): yield ``` ### DBOSMCPServer Bases: `WrapperToolset[AgentDepsT]`, `ABC` A wrapper for MCPServer that integrates with DBOS, turning call_tool and get_tools to DBOS steps. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_mcp_server.py` ```python class DBOSMCPServer(WrapperToolset[AgentDepsT], ABC): """A wrapper for MCPServer that integrates with DBOS, turning call_tool and get_tools to DBOS steps.""" def __init__( self, wrapped: MCPServer, *, step_name_prefix: str, step_config: StepConfig, ): super().__init__(wrapped) self._step_config = step_config or {} self._step_name_prefix = step_name_prefix id_suffix = f'__{wrapped.id}' if wrapped.id else '' self._name = f'{step_name_prefix}__mcp_server{id_suffix}' # Wrap get_tools in a DBOS step. @DBOS.step( name=f'{self._name}.get_tools', **self._step_config, ) async def wrapped_get_tools_step( ctx: RunContext[AgentDepsT], ) -> dict[str, ToolsetTool[AgentDepsT]]: return await super(DBOSMCPServer, self).get_tools(ctx) self._dbos_wrapped_get_tools_step = wrapped_get_tools_step # Wrap call_tool in a DBOS step. 
@DBOS.step( name=f'{self._name}.call_tool', **self._step_config, ) async def wrapped_call_tool_step( name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> ToolResult: return await super(DBOSMCPServer, self).call_tool(name, tool_args, ctx, tool) self._dbos_wrapped_call_tool_step = wrapped_call_tool_step @property def id(self) -> str | None: return self.wrapped.id async def __aenter__(self) -> Self: # The wrapped MCPServer enters itself around listing and calling tools # so we don't need to enter it here (nor could we because we're not inside a DBOS step). return self async def __aexit__(self, *args: Any) -> bool | None: return None def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: # DBOS-ified toolsets cannot be swapped out after the fact. return self async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return await self._dbos_wrapped_get_tools_step(ctx) async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> ToolResult: return await self._dbos_wrapped_call_tool_step(name, tool_args, ctx, tool) ``` ### DBOSModel Bases: `WrapperModel` A wrapper for Model that integrates with DBOS, turning request and request_stream to DBOS steps. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_model.py` ```python class DBOSModel(WrapperModel): """A wrapper for Model that integrates with DBOS, turning request and request_stream to DBOS steps.""" def __init__( self, model: Model, *, step_name_prefix: str, step_config: StepConfig, event_stream_handler: EventStreamHandler[Any] | None = None, ): super().__init__(model) self.step_config = step_config self.event_stream_handler = event_stream_handler self._step_name_prefix = step_name_prefix # Wrap the request in a DBOS step. @DBOS.step( name=f'{self._step_name_prefix}__model.request', **self.step_config, ) async def wrapped_request_step( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: return await super(DBOSModel, self).request(messages, model_settings, model_request_parameters) self._dbos_wrapped_request_step = wrapped_request_step # Wrap the request_stream in a DBOS step. @DBOS.step( name=f'{self._step_name_prefix}__model.request_stream', **self.step_config, ) async def wrapped_request_stream_step( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> ModelResponse: async with super(DBOSModel, self).request_stream( messages, model_settings, model_request_parameters, run_context ) as streamed_response: if self.event_stream_handler is not None: assert run_context is not None, ( 'A DBOS model cannot be used with `pydantic_ai.direct.model_request_stream()` as it requires a `run_context`. Set an `event_stream_handler` on the agent and use `agent.run()` instead.' 
) await self.event_stream_handler(run_context, streamed_response) async for _ in streamed_response: pass return streamed_response.get() self._dbos_wrapped_request_stream_step = wrapped_request_stream_step async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: return await self._dbos_wrapped_request_step(messages, model_settings, model_request_parameters) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: # If not in a workflow (could be in a step), just call the wrapped request_stream method. if DBOS.workflow_id is None or DBOS.step_id is not None: async with super().request_stream( messages, model_settings, model_request_parameters, run_context ) as streamed_response: yield streamed_response return response = await self._dbos_wrapped_request_stream_step( messages, model_settings, model_request_parameters, run_context ) yield DBOSStreamedResponse(model_request_parameters, response) ``` ### StepConfig Bases: `TypedDict` Configuration for a step in the DBOS workflow. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_utils.py` ```python class StepConfig(TypedDict, total=False): """Configuration for a step in the DBOS workflow.""" retries_allowed: bool interval_seconds: float max_attempts: int backoff_rate: float ``` # `pydantic_ai.exceptions` ### ModelRetry Bases: `Exception` Exception to raise when a tool function should be retried. The agent will return the message to the model and ask it to try calling the function/tool again. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ModelRetry(Exception): """Exception to raise when a tool function should be retried. The agent will return the message to the model and ask it to try calling the function/tool again. """ message: str """The message to return to the model.""" def __init__(self, message: str): self.message = message super().__init__(message) def __eq__(self, other: Any) -> bool: return isinstance(other, self.__class__) and other.message == self.message @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> core_schema.CoreSchema: """Pydantic core schema to allow `ModelRetry` to be (de)serialized.""" schema = core_schema.typed_dict_schema( { 'message': core_schema.typed_dict_field(core_schema.str_schema()), 'kind': core_schema.typed_dict_field(core_schema.literal_schema(['model-retry'])), } ) return core_schema.no_info_after_validator_function( lambda dct: ModelRetry(dct['message']), schema, serialization=core_schema.plain_serializer_function_ser_schema( lambda x: {'message': x.message, 'kind': 'model-retry'}, return_schema=schema, ), ) ``` #### message ```python message: str = message ``` The message to return to the model. #### __get_pydantic_core_schema__ ```python __get_pydantic_core_schema__(_: Any, __: Any) -> CoreSchema ``` Pydantic core schema to allow `ModelRetry` to be (de)serialized. 
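As a quick illustration of what this schema enables, here is a minimal round-trip sketch; it assumes you drive the schema with `pydantic.TypeAdapter`, and the retry message is made up:

```python
from pydantic import TypeAdapter

from pydantic_ai.exceptions import ModelRetry

# Build an adapter from the core schema defined above.
adapter = TypeAdapter(ModelRetry)

# Serialization: the plain serializer emits a tagged dict.
payload = adapter.dump_python(ModelRetry('Please provide the date in ISO format.'))
print(payload)
#> {'message': 'Please provide the date in ISO format.', 'kind': 'model-retry'}

# Deserialization: the after-validator reconstructs the exception.
restored = adapter.validate_python(payload)
assert isinstance(restored, ModelRetry)
assert restored.message == 'Please provide the date in ISO format.'
```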
Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> core_schema.CoreSchema: """Pydantic core schema to allow `ModelRetry` to be (de)serialized.""" schema = core_schema.typed_dict_schema( { 'message': core_schema.typed_dict_field(core_schema.str_schema()), 'kind': core_schema.typed_dict_field(core_schema.literal_schema(['model-retry'])), } ) return core_schema.no_info_after_validator_function( lambda dct: ModelRetry(dct['message']), schema, serialization=core_schema.plain_serializer_function_ser_schema( lambda x: {'message': x.message, 'kind': 'model-retry'}, return_schema=schema, ), ) ``` ### CallDeferred Bases: `Exception` Exception to raise when a tool call should be deferred. See [tools docs](../../deferred-tools/#deferred-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class CallDeferred(Exception): """Exception to raise when a tool call should be deferred. See [tools docs](../deferred-tools.md#deferred-tools) for more information. """ pass ``` ### ApprovalRequired Bases: `Exception` Exception to raise when a tool call requires human-in-the-loop approval. See [tools docs](../../deferred-tools/#human-in-the-loop-tool-approval) for more information. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ApprovalRequired(Exception): """Exception to raise when a tool call requires human-in-the-loop approval. See [tools docs](../deferred-tools.md#human-in-the-loop-tool-approval) for more information. """ pass ``` ### UserError Bases: `RuntimeError` Error caused by a usage mistake by the application developer — You! Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class UserError(RuntimeError): """Error caused by a usage mistake by the application developer — You!""" message: str """Description of the mistake.""" def __init__(self, message: str): self.message = message super().__init__(message) ``` #### message ```python message: str = message ``` Description of the mistake. ### AgentRunError Bases: `RuntimeError` Base class for errors occurring during an agent run. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class AgentRunError(RuntimeError): """Base class for errors occurring during an agent run.""" message: str """The error message.""" def __init__(self, message: str): self.message = message super().__init__(message) def __str__(self) -> str: return self.message ``` #### message ```python message: str = message ``` The error message. ### UsageLimitExceeded Bases: `AgentRunError` Error raised when a Model's usage exceeds the specified limits. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class UsageLimitExceeded(AgentRunError): """Error raised when a Model's usage exceeds the specified limits.""" ``` ### UnexpectedModelBehavior Bases: `AgentRunError` Error caused by unexpected Model behavior, e.g. an unexpected response code. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class UnexpectedModelBehavior(AgentRunError): """Error caused by unexpected Model behavior, e.g. 
an unexpected response code.""" message: str """Description of the unexpected behavior.""" body: str | None """The body of the response, if available.""" def __init__(self, message: str, body: str | None = None): self.message = message if body is None: self.body: str | None = None else: try: self.body = json.dumps(json.loads(body), indent=2) except ValueError: self.body = body super().__init__(message) def __str__(self) -> str: if self.body: return f'{self.message}, body:\n{self.body}' else: return self.message ``` #### message ```python message: str = message ``` Description of the unexpected behavior. #### body ```python body: str | None = dumps(loads(body), indent=2) ``` The body of the response, if available. ### ModelHTTPError Bases: `AgentRunError` Raised when a model provider response has a status code of 4xx or 5xx. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ModelHTTPError(AgentRunError): """Raised when a model provider response has a status code of 4xx or 5xx.""" status_code: int """The HTTP status code returned by the API.""" model_name: str """The name of the model associated with the error.""" body: object | None """The body of the response, if available.""" message: str """The error message with the status code and response body, if available.""" def __init__(self, status_code: int, model_name: str, body: object | None = None): self.status_code = status_code self.model_name = model_name self.body = body message = f'status_code: {status_code}, model_name: {model_name}, body: {body}' super().__init__(message) ``` #### message ```python message: str ``` The error message with the status code and response body, if available. #### status_code ```python status_code: int = status_code ``` The HTTP status code returned by the API. #### model_name ```python model_name: str = model_name ``` The name of the model associated with the error. #### body ```python body: object | None = body ``` The body of the response, if available. ### FallbackExceptionGroup Bases: `ExceptionGroup` A group of exceptions that can be raised when all fallback models fail. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class FallbackExceptionGroup(ExceptionGroup): """A group of exceptions that can be raised when all fallback models fail.""" ``` # `pydantic_ai.ext` ### tool_from_langchain ```python tool_from_langchain(langchain_tool: LangChainTool) -> Tool ``` Creates a Pydantic AI tool proxy from a LangChain tool. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `langchain_tool` | `LangChainTool` | The LangChain tool to wrap. | *required* | Returns: | Type | Description | | --- | --- | | `Tool` | A Pydantic AI tool that corresponds to the LangChain tool. | Source code in `pydantic_ai_slim/pydantic_ai/ext/langchain.py` ```python def tool_from_langchain(langchain_tool: LangChainTool) -> Tool: """Creates a Pydantic AI tool proxy from a LangChain tool. Args: langchain_tool: The LangChain tool to wrap. Returns: A Pydantic AI tool that corresponds to the LangChain tool.
""" function_name = langchain_tool.name function_description = langchain_tool.description inputs = langchain_tool.args.copy() required = sorted({name for name, detail in inputs.items() if 'default' not in detail}) schema: JsonSchemaValue = langchain_tool.get_input_jsonschema() if 'additionalProperties' not in schema: schema['additionalProperties'] = False if required: schema['required'] = required defaults = {name: detail['default'] for name, detail in inputs.items() if 'default' in detail} # restructures the arguments to match langchain tool run def proxy(*args: Any, **kwargs: Any) -> str: assert not args, 'This should always be called with kwargs' kwargs = defaults | kwargs return langchain_tool.run(kwargs) return Tool.from_schema( function=proxy, name=function_name, description=function_description, json_schema=schema, ) ``` ### LangChainToolset Bases: `FunctionToolset` A toolset that wraps LangChain tools. Source code in `pydantic_ai_slim/pydantic_ai/ext/langchain.py` ```python class LangChainToolset(FunctionToolset): """A toolset that wraps LangChain tools.""" def __init__(self, tools: list[LangChainTool], *, id: str | None = None): super().__init__([tool_from_langchain(tool) for tool in tools], id=id) ``` ### tool_from_aci ```python tool_from_aci( aci_function: str, linked_account_owner_id: str ) -> Tool ``` Creates a Pydantic AI tool proxy from an ACI.dev function. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `aci_function` | `str` | The ACI.dev function to wrap. | *required* | | `linked_account_owner_id` | `str` | The ACI user ID to execute the function on behalf of. | *required* | Returns: | Type | Description | | --- | --- | | `Tool` | A Pydantic AI tool that corresponds to the ACI.dev tool. | Source code in `pydantic_ai_slim/pydantic_ai/ext/aci.py` ```python def tool_from_aci(aci_function: str, linked_account_owner_id: str) -> Tool: """Creates a Pydantic AI tool proxy from an ACI.dev function. Args: aci_function: The ACI.dev function to wrap. linked_account_owner_id: The ACI user ID to execute the function on behalf of. Returns: A Pydantic AI tool that corresponds to the ACI.dev tool. """ aci = ACI() function_definition = aci.functions.get_definition(aci_function) function_name = function_definition['function']['name'] function_description = function_definition['function']['description'] inputs = function_definition['function']['parameters'] json_schema = { 'additionalProperties': inputs.get('additionalProperties', False), 'properties': inputs.get('properties', {}), 'required': inputs.get('required', []), # Default to 'object' if not specified 'type': inputs.get('type', 'object'), } # Clean the schema json_schema = _clean_schema(json_schema) def implementation(*args: Any, **kwargs: Any) -> str: if args: raise TypeError('Positional arguments are not allowed') return aci.handle_function_call( function_name, kwargs, linked_account_owner_id=linked_account_owner_id, allowed_apps_only=True, ) return Tool.from_schema( function=implementation, name=function_name, description=function_description, json_schema=json_schema, ) ``` ### ACIToolset Bases: `FunctionToolset` A toolset that wraps ACI.dev tools. 
Source code in `pydantic_ai_slim/pydantic_ai/ext/aci.py` ```python class ACIToolset(FunctionToolset): """A toolset that wraps ACI.dev tools.""" def __init__(self, aci_functions: Sequence[str], linked_account_owner_id: str, *, id: str | None = None): super().__init__( [tool_from_aci(aci_function, linked_account_owner_id) for aci_function in aci_functions], id=id ) ``` # `fasta2a` ### FastA2A Bases: `Starlette` The main class for the FastA2A library. Source code in `.venv/lib/python3.12/site-packages/fasta2a/applications.py` ```python class FastA2A(Starlette): """The main class for the FastA2A library.""" def __init__( self, *, storage: Storage, broker: Broker, # Agent card name: str | None = None, url: str = 'http://localhost:8000', version: str = '1.0.0', description: str | None = None, provider: AgentProvider | None = None, skills: list[Skill] | None = None, # Starlette debug: bool = False, routes: Sequence[Route] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: dict[Any, ExceptionHandler] | None = None, lifespan: Lifespan[FastA2A] | None = None, ): if lifespan is None: lifespan = _default_lifespan super().__init__( debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, lifespan=lifespan, ) self.name = name or 'My Agent' self.url = url self.version = version self.description = description self.provider = provider self.skills = skills or [] # NOTE: For now, I don't think there's any reason to support any other input/output modes. self.default_input_modes = ['application/json'] self.default_output_modes = ['application/json'] self.task_manager = TaskManager(broker=broker, storage=storage) # Setup self._agent_card_json_schema: bytes | None = None self.router.add_route('/.well-known/agent.json', self._agent_card_endpoint, methods=['HEAD', 'GET', 'OPTIONS']) self.router.add_route('/', self._agent_run_endpoint, methods=['POST']) self.router.add_route('/docs', self._docs_endpoint, methods=['GET']) async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None: if scope['type'] == 'http' and not self.task_manager.is_running: raise RuntimeError('TaskManager was not properly initialized.') await super().__call__(scope, receive, send) async def _agent_card_endpoint(self, request: Request) -> Response: if self._agent_card_json_schema is None: agent_card = AgentCard( name=self.name, description=self.description or 'An AI agent exposed as an A2A agent.', url=self.url, version=self.version, protocol_version='0.2.5', skills=self.skills, default_input_modes=self.default_input_modes, default_output_modes=self.default_output_modes, capabilities=AgentCapabilities( streaming=False, push_notifications=False, state_transition_history=False ), ) if self.provider is not None: agent_card['provider'] = self.provider self._agent_card_json_schema = agent_card_ta.dump_json(agent_card, by_alias=True) return Response(content=self._agent_card_json_schema, media_type='application/json') async def _docs_endpoint(self, request: Request) -> Response: """Serve the documentation interface.""" docs_path = Path(__file__).parent / 'static' / 'docs.html' return FileResponse(docs_path, media_type='text/html') async def _agent_run_endpoint(self, request: Request) -> Response: """This is the main endpoint for the A2A server. Although the specification allows freedom of choice and implementation, I'm pretty sure about some decisions. 1. The server will always either send a "submitted" or a "failed" on `tasks/send`. 
Never a "completed" on the first message. 2. There are three possible ends for the task: 2.1. The task was "completed" successfully. 2.2. The task was "canceled". 2.3. The task "failed". 3. The server will send a "working" on the first chunk on `tasks/pushNotification/get`. """ data = await request.body() a2a_request = a2a_request_ta.validate_json(data) if a2a_request['method'] == 'message/send': jsonrpc_response = await self.task_manager.send_message(a2a_request) elif a2a_request['method'] == 'tasks/get': jsonrpc_response = await self.task_manager.get_task(a2a_request) elif a2a_request['method'] == 'tasks/cancel': jsonrpc_response = await self.task_manager.cancel_task(a2a_request) else: raise NotImplementedError(f'Method {a2a_request["method"]} not implemented.') return Response( content=a2a_response_ta.dump_json(jsonrpc_response, by_alias=True), media_type='application/json' ) ``` ### Broker Bases: `ABC` The broker class is in charge of scheduling the tasks. The HTTP server uses the broker to schedule tasks. The simple implementation is the `InMemoryBroker`, which is the broker that runs the tasks in the same process as the HTTP server. That said, this class can be extended to support remote workers. Source code in `.venv/lib/python3.12/site-packages/fasta2a/broker.py` ```python @dataclass class Broker(ABC): """The broker class is in charge of scheduling the tasks. The HTTP server uses the broker to schedule tasks. The simple implementation is the `InMemoryBroker`, which is the broker that runs the tasks in the same process as the HTTP server. That said, this class can be extended to support remote workers. """ @abstractmethod async def run_task(self, params: TaskSendParams) -> None: """Send a task to be executed by the worker.""" raise NotImplementedError('send_run_task is not implemented yet.') @abstractmethod async def cancel_task(self, params: TaskIdParams) -> None: """Cancel a task.""" raise NotImplementedError('send_cancel_task is not implemented yet.') @abstractmethod async def __aenter__(self) -> Self: ... @abstractmethod async def __aexit__(self, exc_type: Any, exc_value: Any, traceback: Any): ... @abstractmethod def receive_task_operations(self) -> AsyncIterator[TaskOperation]: """Receive task operations from the broker. On a multi-worker setup, the broker will need to round-robin the task operations between the workers. """ ``` #### run_task ```python run_task(params: TaskSendParams) -> None ``` Send a task to be executed by the worker. Source code in `.venv/lib/python3.12/site-packages/fasta2a/broker.py` ```python @abstractmethod async def run_task(self, params: TaskSendParams) -> None: """Send a task to be executed by the worker.""" raise NotImplementedError('send_run_task is not implemented yet.') ``` #### cancel_task ```python cancel_task(params: TaskIdParams) -> None ``` Cancel a task. Source code in `.venv/lib/python3.12/site-packages/fasta2a/broker.py` ```python @abstractmethod async def cancel_task(self, params: TaskIdParams) -> None: """Cancel a task.""" raise NotImplementedError('send_cancel_task is not implemented yet.') ``` #### receive_task_operations ```python receive_task_operations() -> AsyncIterator[TaskOperation] ``` Receive task operations from the broker. On a multi-worker setup, the broker will need to round-robin the task operations between the workers. 
Source code in `.venv/lib/python3.12/site-packages/fasta2a/broker.py` ```python @abstractmethod def receive_task_operations(self) -> AsyncIterator[TaskOperation]: """Receive task operations from the broker. On a multi-worker setup, the broker will need to round-robin the task operations between the workers. """ ``` ### Skill Bases: `TypedDict` Skills are a unit of capability that an agent can perform. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Skill(TypedDict): """Skills are a unit of capability that an agent can perform.""" id: str """A unique identifier for the skill.""" name: str """Human readable name of the skill.""" description: str """A human-readable description of the skill. It will be used by the client or a human as a hint to understand the skill. """ tags: list[str] """Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing". """ examples: NotRequired[list[str]] """The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread") """ input_modes: list[str] """Supported mime types for input data.""" output_modes: list[str] """Supported mime types for output data.""" ``` #### id ```python id: str ``` A unique identifier for the skill. #### name ```python name: str ``` Human readable name of the skill. #### description ```python description: str ``` A human-readable description of the skill. It will be used by the client or a human as a hint to understand the skill. #### tags ```python tags: list[str] ``` Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing". #### examples ```python examples: NotRequired[list[str]] ``` The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread") #### input_modes ```python input_modes: list[str] ``` Supported mime types for input data. #### output_modes ```python output_modes: list[str] ``` Supported mime types for output data. ### Storage Bases: `ABC`, `Generic[ContextT]` A storage to retrieve and save tasks, as well as retrieve and save context. The storage serves two purposes: 1. Task storage: Stores tasks in A2A protocol format with their status, artifacts, and message history 1. Context storage: Stores conversation context in a format optimized for the specific agent implementation Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python class Storage(ABC, Generic[ContextT]): """A storage to retrieve and save tasks, as well as retrieve and save context. The storage serves two purposes: 1. Task storage: Stores tasks in A2A protocol format with their status, artifacts, and message history 2. Context storage: Stores conversation context in a format optimized for the specific agent implementation """ @abstractmethod async def load_task(self, task_id: str, history_length: int | None = None) -> Task | None: """Load a task from storage. If the task is not found, return None. 
""" @abstractmethod async def submit_task(self, context_id: str, message: Message) -> Task: """Submit a task to storage.""" @abstractmethod async def update_task( self, task_id: str, state: TaskState, new_artifacts: list[Artifact] | None = None, new_messages: list[Message] | None = None, ) -> Task: """Update the state of a task. Appends artifacts and messages, if specified.""" @abstractmethod async def load_context(self, context_id: str) -> ContextT | None: """Retrieve the stored context given the `context_id`.""" @abstractmethod async def update_context(self, context_id: str, context: ContextT) -> None: """Updates the context for a `context_id`. Implementing agent can decide what to store in context. """ ``` #### load_task ```python load_task( task_id: str, history_length: int | None = None ) -> Task | None ``` Load a task from storage. If the task is not found, return None. Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def load_task(self, task_id: str, history_length: int | None = None) -> Task | None: """Load a task from storage. If the task is not found, return None. """ ``` #### submit_task ```python submit_task(context_id: str, message: Message) -> Task ``` Submit a task to storage. Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def submit_task(self, context_id: str, message: Message) -> Task: """Submit a task to storage.""" ``` #### update_task ```python update_task( task_id: str, state: TaskState, new_artifacts: list[Artifact] | None = None, new_messages: list[Message] | None = None, ) -> Task ``` Update the state of a task. Appends artifacts and messages, if specified. Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def update_task( self, task_id: str, state: TaskState, new_artifacts: list[Artifact] | None = None, new_messages: list[Message] | None = None, ) -> Task: """Update the state of a task. Appends artifacts and messages, if specified.""" ``` #### load_context ```python load_context(context_id: str) -> ContextT | None ``` Retrieve the stored context given the `context_id`. Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def load_context(self, context_id: str) -> ContextT | None: """Retrieve the stored context given the `context_id`.""" ``` #### update_context ```python update_context(context_id: str, context: ContextT) -> None ``` Updates the context for a `context_id`. Implementing agent can decide what to store in context. Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def update_context(self, context_id: str, context: ContextT) -> None: """Updates the context for a `context_id`. Implementing agent can decide what to store in context. """ ``` ### Worker Bases: `ABC`, `Generic[ContextT]` A worker is responsible for executing tasks. Source code in `.venv/lib/python3.12/site-packages/fasta2a/worker.py` ```python @dataclass class Worker(ABC, Generic[ContextT]): """A worker is responsible for executing tasks.""" broker: Broker storage: Storage[ContextT] @asynccontextmanager async def run(self) -> AsyncIterator[None]: """Run the worker. It connects to the broker, and it makes itself available to receive commands. 
""" async with anyio.create_task_group() as tg: tg.start_soon(self._loop) yield tg.cancel_scope.cancel() async def _loop(self) -> None: async for task_operation in self.broker.receive_task_operations(): await self._handle_task_operation(task_operation) async def _handle_task_operation(self, task_operation: TaskOperation) -> None: try: with use_span(task_operation['_current_span']): with tracer.start_as_current_span( f'{task_operation["operation"]} task', attributes={'logfire.tags': ['fasta2a']} ): if task_operation['operation'] == 'run': await self.run_task(task_operation['params']) elif task_operation['operation'] == 'cancel': await self.cancel_task(task_operation['params']) else: assert_never(task_operation) except Exception: await self.storage.update_task(task_operation['params']['id'], state='failed') @abstractmethod async def run_task(self, params: TaskSendParams) -> None: ... @abstractmethod async def cancel_task(self, params: TaskIdParams) -> None: ... @abstractmethod def build_message_history(self, history: list[Message]) -> list[Any]: ... @abstractmethod def build_artifacts(self, result: Any) -> list[Artifact]: ... ``` #### run ```python run() -> AsyncIterator[None] ``` Run the worker. It connects to the broker, and it makes itself available to receive commands. Source code in `.venv/lib/python3.12/site-packages/fasta2a/worker.py` ```python @asynccontextmanager async def run(self) -> AsyncIterator[None]: """Run the worker. It connects to the broker, and it makes itself available to receive commands. """ async with anyio.create_task_group() as tg: tg.start_soon(self._loop) yield tg.cancel_scope.cancel() ``` This module contains the schema for the agent card. ### AgentCard Bases: `TypedDict` The card that describes an agent. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class AgentCard(TypedDict): """The card that describes an agent.""" name: str """Human readable name of the agent e.g. "Recipe Agent".""" description: str """A human-readable description of the agent. Used to assist users and other agents in understanding what the agent can do. (e.g. "Agent that helps users with recipes and cooking.") """ url: str """A URL to the address the agent is hosted at.""" version: str """The version of the agent - format is up to the provider. (e.g. "1.0.0")""" protocol_version: str """The version of the A2A protocol this agent supports.""" provider: NotRequired[AgentProvider] """The service provider of the agent.""" documentation_url: NotRequired[str] """A URL to documentation for the agent.""" icon_url: NotRequired[str] """A URL to an icon for the agent.""" preferred_transport: NotRequired[str] """The transport of the preferred endpoint. If empty, defaults to JSONRPC.""" additional_interfaces: NotRequired[list[AgentInterface]] """Announcement of additional supported transports.""" capabilities: AgentCapabilities """The capabilities of the agent.""" security: NotRequired[list[dict[str, list[str]]]] """Security requirements for contacting the agent.""" security_schemes: NotRequired[dict[str, SecurityScheme]] """Security scheme definitions.""" default_input_modes: list[str] """Supported mime types for input data.""" default_output_modes: list[str] """Supported mime types for output data.""" skills: list[Skill] ``` #### name ```python name: str ``` Human readable name of the agent e.g. "Recipe Agent". #### description ```python description: str ``` A human-readable description of the agent. 
Used to assist users and other agents in understanding what the agent can do. (e.g. "Agent that helps users with recipes and cooking.") #### url ```python url: str ``` A URL to the address the agent is hosted at. #### version ```python version: str ``` The version of the agent - format is up to the provider. (e.g. "1.0.0") #### protocol_version ```python protocol_version: str ``` The version of the A2A protocol this agent supports. #### provider ```python provider: NotRequired[AgentProvider] ``` The service provider of the agent. #### documentation_url ```python documentation_url: NotRequired[str] ``` A URL to documentation for the agent. #### icon_url ```python icon_url: NotRequired[str] ``` A URL to an icon for the agent. #### preferred_transport ```python preferred_transport: NotRequired[str] ``` The transport of the preferred endpoint. If empty, defaults to JSONRPC. #### additional_interfaces ```python additional_interfaces: NotRequired[list[AgentInterface]] ``` Announcement of additional supported transports. #### capabilities ```python capabilities: AgentCapabilities ``` The capabilities of the agent. #### security ```python security: NotRequired[list[dict[str, list[str]]]] ``` Security requirements for contacting the agent. #### security_schemes ```python security_schemes: NotRequired[dict[str, SecurityScheme]] ``` Security scheme definitions. #### default_input_modes ```python default_input_modes: list[str] ``` Supported mime types for input data. #### default_output_modes ```python default_output_modes: list[str] ``` Supported mime types for output data. ### AgentProvider Bases: `TypedDict` The service provider of the agent. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class AgentProvider(TypedDict): """The service provider of the agent.""" organization: str url: str ``` ### AgentCapabilities Bases: `TypedDict` The capabilities of the agent. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class AgentCapabilities(TypedDict): """The capabilities of the agent.""" streaming: NotRequired[bool] """Whether the agent supports streaming.""" push_notifications: NotRequired[bool] """Whether the agent can notify updates to client.""" state_transition_history: NotRequired[bool] """Whether the agent exposes status change history for tasks.""" ``` #### streaming ```python streaming: NotRequired[bool] ``` Whether the agent supports streaming. #### push_notifications ```python push_notifications: NotRequired[bool] ``` Whether the agent can notify updates to client. #### state_transition_history ```python state_transition_history: NotRequired[bool] ``` Whether the agent exposes status change history for tasks. ### HttpSecurityScheme Bases: `TypedDict` HTTP security scheme. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class HttpSecurityScheme(TypedDict): """HTTP security scheme.""" type: Literal['http'] scheme: str """The name of the HTTP Authorization scheme.""" bearer_format: NotRequired[str] """A hint to the client to identify how the bearer token is formatted.""" description: NotRequired[str] """Description of this security scheme.""" ``` #### scheme ```python scheme: str ``` The name of the HTTP Authorization scheme. #### bearer_format ```python bearer_format: NotRequired[str] ``` A hint to the client to identify how the bearer token is formatted. 
#### description ```python description: NotRequired[str] ``` Description of this security scheme. ### ApiKeySecurityScheme Bases: `TypedDict` API Key security scheme. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class ApiKeySecurityScheme(TypedDict): """API Key security scheme.""" type: Literal['apiKey'] name: str """The name of the header, query or cookie parameter to be used.""" in_: Literal['query', 'header', 'cookie'] """The location of the API key.""" description: NotRequired[str] """Description of this security scheme.""" ``` #### name ```python name: str ``` The name of the header, query or cookie parameter to be used. #### in\_ ```python in_: Literal['query', 'header', 'cookie'] ``` The location of the API key. #### description ```python description: NotRequired[str] ``` Description of this security scheme. ### OAuth2SecurityScheme Bases: `TypedDict` OAuth2 security scheme. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class OAuth2SecurityScheme(TypedDict): """OAuth2 security scheme.""" type: Literal['oauth2'] flows: dict[str, Any] """An object containing configuration information for the flow types supported.""" description: NotRequired[str] """Description of this security scheme.""" ``` #### flows ```python flows: dict[str, Any] ``` An object containing configuration information for the flow types supported. #### description ```python description: NotRequired[str] ``` Description of this security scheme. ### OpenIdConnectSecurityScheme Bases: `TypedDict` OpenID Connect security scheme. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class OpenIdConnectSecurityScheme(TypedDict): """OpenID Connect security scheme.""" type: Literal['openIdConnect'] open_id_connect_url: str """OpenId Connect URL to discover OAuth2 configuration values.""" description: NotRequired[str] """Description of this security scheme.""" ``` #### open_id_connect_url ```python open_id_connect_url: str ``` OpenId Connect URL to discover OAuth2 configuration values. #### description ```python description: NotRequired[str] ``` Description of this security scheme. ### SecurityScheme ```python SecurityScheme = Annotated[ Union[ HttpSecurityScheme, ApiKeySecurityScheme, OAuth2SecurityScheme, OpenIdConnectSecurityScheme, ], Field(discriminator="type"), ] ``` A security scheme for authentication. ### AgentInterface Bases: `TypedDict` An interface that the agent supports. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class AgentInterface(TypedDict): """An interface that the agent supports.""" transport: str """The transport protocol (e.g., 'jsonrpc', 'websocket').""" url: str """The URL endpoint for this transport.""" description: NotRequired[str] """Description of this interface.""" ``` #### transport ```python transport: str ``` The transport protocol (e.g., 'jsonrpc', 'websocket'). #### url ```python url: str ``` The URL endpoint for this transport. #### description ```python description: NotRequired[str] ``` Description of this interface. ### AgentExtension Bases: `TypedDict` A declaration of an extension supported by an Agent. 
Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class AgentExtension(TypedDict): """A declaration of an extension supported by an Agent.""" uri: str """The URI of the extension.""" description: NotRequired[str] """A description of how this agent uses this extension.""" required: NotRequired[bool] """Whether the client must follow specific requirements of the extension.""" params: NotRequired[dict[str, Any]] """Optional configuration for the extension.""" ``` #### uri ```python uri: str ``` The URI of the extension. #### description ```python description: NotRequired[str] ``` A description of how this agent uses this extension. #### required ```python required: NotRequired[bool] ``` Whether the client must follow specific requirements of the extension. #### params ```python params: NotRequired[dict[str, Any]] ``` Optional configuration for the extension. ### Skill Bases: `TypedDict` Skills are a unit of capability that an agent can perform. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Skill(TypedDict): """Skills are a unit of capability that an agent can perform.""" id: str """A unique identifier for the skill.""" name: str """Human readable name of the skill.""" description: str """A human-readable description of the skill. It will be used by the client or a human as a hint to understand the skill. """ tags: list[str] """Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing". """ examples: NotRequired[list[str]] """The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread") """ input_modes: list[str] """Supported mime types for input data.""" output_modes: list[str] """Supported mime types for output data.""" ``` #### id ```python id: str ``` A unique identifier for the skill. #### name ```python name: str ``` Human readable name of the skill. #### description ```python description: str ``` A human-readable description of the skill. It will be used by the client or a human as a hint to understand the skill. #### tags ```python tags: list[str] ``` Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing". #### examples ```python examples: NotRequired[list[str]] ``` The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread") #### input_modes ```python input_modes: list[str] ``` Supported mime types for input data. #### output_modes ```python output_modes: list[str] ``` Supported mime types for output data. ### Artifact Bases: `TypedDict` Agents generate Artifacts as an end result of a Task. Artifacts are immutable, can be named, and can have multiple parts. A streaming response can append parts to existing Artifacts. A single Task can generate many Artifacts. For example, "create a webpage" could create separate HTML and image Artifacts. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Artifact(TypedDict): """Agents generate Artifacts as an end result of a Task. Artifacts are immutable, can be named, and can have multiple parts. 
A streaming response can append parts to existing Artifacts. A single Task can generate many Artifacts. For example, "create a webpage" could create separate HTML and image Artifacts. """ artifact_id: str """Unique identifier for the artifact.""" name: NotRequired[str] """The name of the artifact.""" description: NotRequired[str] """A description of the artifact.""" parts: list[Part] """The parts that make up the artifact.""" metadata: NotRequired[dict[str, Any]] """Metadata about the artifact.""" extensions: NotRequired[list[str]] """Array of extensions.""" append: NotRequired[bool] """Whether to append this artifact to an existing one.""" last_chunk: NotRequired[bool] """Whether this is the last chunk of the artifact.""" ``` #### artifact_id ```python artifact_id: str ``` Unique identifier for the artifact. #### name ```python name: NotRequired[str] ``` The name of the artifact. #### description ```python description: NotRequired[str] ``` A description of the artifact. #### parts ```python parts: list[Part] ``` The parts that make up the artifact. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Metadata about the artifact. #### extensions ```python extensions: NotRequired[list[str]] ``` Array of extensions. #### append ```python append: NotRequired[bool] ``` Whether to append this artifact to an existing one. #### last_chunk ```python last_chunk: NotRequired[bool] ``` Whether this is the last chunk of the artifact. ### PushNotificationConfig Bases: `TypedDict` Configuration for push notifications. A2A supports a secure notification mechanism whereby an agent can notify a client of an update outside of a connected session via a PushNotificationService. Within and across enterprises, it is critical that the agent verifies the identity of the notification service, authenticates itself with the service, and presents an identifier that ties the notification to the executing Task. The target server of the PushNotificationService should be considered a separate service, and is not guaranteed (or even expected) to be the client directly. This PushNotificationService is responsible for authenticating and authorizing the agent and for proxying the verified notification to the appropriate endpoint (which could be anything from a pub/sub queue, to an email inbox or other service, etc). For contrived scenarios with isolated client-agent pairs (e.g. local service mesh in a contained VPC, etc.) or isolated environments without enterprise security concerns, the client may choose to simply open a port and act as its own PushNotificationService. Any enterprise implementation will likely have a centralized service that authenticates the remote agents with trusted notification credentials and can handle online/offline scenarios. (This should be thought of similarly to a mobile Push Notification Service). Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class PushNotificationConfig(TypedDict): """Configuration for push notifications. A2A supports a secure notification mechanism whereby an agent can notify a client of an update outside of a connected session via a PushNotificationService. Within and across enterprises, it is critical that the agent verifies the identity of the notification service, authenticates itself with the service, and presents an identifier that ties the notification to the executing Task. 
The target server of the PushNotificationService should be considered a separate service, and is not guaranteed (or even expected) to be the client directly. This PushNotificationService is responsible for authenticating and authorizing the agent and for proxying the verified notification to the appropriate endpoint (which could be anything from a pub/sub queue, to an email inbox or other service, etc). For contrived scenarios with isolated client-agent pairs (e.g. local service mesh in a contained VPC, etc.) or isolated environments without enterprise security concerns, the client may choose to simply open a port and act as its own PushNotificationService. Any enterprise implementation will likely have a centralized service that authenticates the remote agents with trusted notification credentials and can handle online/offline scenarios. (This should be thought of similarly to a mobile Push Notification Service). """ id: NotRequired[str] """Server-assigned identifier.""" url: str """The URL to send push notifications to.""" token: NotRequired[str] """Token unique to this task/session.""" authentication: NotRequired[SecurityScheme] """Authentication details for push notifications.""" ``` #### id ```python id: NotRequired[str] ``` Server-assigned identifier. #### url ```python url: str ``` The URL to send push notifications to. #### token ```python token: NotRequired[str] ``` Token unique to this task/session. #### authentication ```python authentication: NotRequired[SecurityScheme] ``` Authentication details for push notifications. ### TaskPushNotificationConfig Bases: `TypedDict` Configuration for task push notifications. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskPushNotificationConfig(TypedDict): """Configuration for task push notifications.""" id: str """The task id.""" push_notification_config: PushNotificationConfig """The push notification configuration.""" ``` #### id ```python id: str ``` The task id. #### push_notification_config ```python push_notification_config: PushNotificationConfig ``` The push notification configuration. ### Message Bases: `TypedDict` A Message contains any content that is not an Artifact. This can include things like agent thoughts, user context, instructions, errors, status, or metadata. All content from a client comes in the form of a Message. Agents send Messages to communicate status or to provide instructions (whereas generated results are sent as Artifacts). A Message can have multiple parts to denote different pieces of content. For example, a user request could include a textual description from a user and then multiple files used as context from the client. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Message(TypedDict): """A Message contains any content that is not an Artifact. This can include things like agent thoughts, user context, instructions, errors, status, or metadata. All content from a client comes in the form of a Message. Agents send Messages to communicate status or to provide instructions (whereas generated results are sent as Artifacts). A Message can have multiple parts to denote different pieces of content. For example, a user request could include a textual description from a user and then multiple files used as context from the client. 
""" role: Literal['user', 'agent'] """The role of the message.""" parts: list[Part] """The parts of the message.""" kind: Literal['message'] """Event type.""" metadata: NotRequired[dict[str, Any]] """Metadata about the message.""" # Additional fields message_id: str """Identifier created by the message creator.""" context_id: NotRequired[str] """The context the message is associated with.""" task_id: NotRequired[str] """Identifier of task the message is related to.""" reference_task_ids: NotRequired[list[str]] """Array of task IDs this message references.""" extensions: NotRequired[list[str]] """Array of extensions.""" ``` #### role ```python role: Literal['user', 'agent'] ``` The role of the message. #### parts ```python parts: list[Part] ``` The parts of the message. #### kind ```python kind: Literal['message'] ``` Event type. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Metadata about the message. #### message_id ```python message_id: str ``` Identifier created by the message creator. #### context_id ```python context_id: NotRequired[str] ``` The context the message is associated with. #### task_id ```python task_id: NotRequired[str] ``` Identifier of task the message is related to. #### reference_task_ids ```python reference_task_ids: NotRequired[list[str]] ``` Array of task IDs this message references. #### extensions ```python extensions: NotRequired[list[str]] ``` Array of extensions. ### TextPart Bases: `_BasePart` A part that contains text. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TextPart(_BasePart): """A part that contains text.""" kind: Literal['text'] """The kind of the part.""" text: str """The text of the part.""" ``` #### kind ```python kind: Literal['text'] ``` The kind of the part. #### text ```python text: str ``` The text of the part. ### FileWithBytes Bases: `TypedDict` File with base64 encoded data. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class FileWithBytes(TypedDict): """File with base64 encoded data.""" bytes: str """The base64 encoded content of the file.""" mime_type: NotRequired[str] """Optional mime type for the file.""" ``` #### bytes ```python bytes: str ``` The base64 encoded content of the file. #### mime_type ```python mime_type: NotRequired[str] ``` Optional mime type for the file. ### FileWithUri Bases: `TypedDict` File with URI reference. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class FileWithUri(TypedDict): """File with URI reference.""" uri: str """The URI of the file.""" mime_type: NotRequired[str] """The mime type of the file.""" ``` #### uri ```python uri: str ``` The URI of the file. #### mime_type ```python mime_type: NotRequired[str] ``` The mime type of the file. ### FilePart Bases: `_BasePart` A part that contains a file. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class FilePart(_BasePart): """A part that contains a file.""" kind: Literal['file'] """The kind of the part.""" file: FileWithBytes | FileWithUri """The file content - either bytes or URI.""" ``` #### kind ```python kind: Literal['file'] ``` The kind of the part. #### file ```python file: FileWithBytes | FileWithUri ``` The file content - either bytes or URI. 
### DataPart Bases: `_BasePart` A part that contains structured data. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class DataPart(_BasePart): """A part that contains structured data.""" kind: Literal['data'] """The kind of the part.""" data: dict[str, Any] """The data of the part.""" ``` #### kind ```python kind: Literal['data'] ``` The kind of the part. #### data ```python data: dict[str, Any] ``` The data of the part. ### Part ```python Part = Annotated[ Union[TextPart, FilePart, DataPart], Field(discriminator="kind"), ] ``` A fully formed piece of content exchanged between a client and a remote agent as part of a Message or an Artifact. Each Part has its own content type and metadata. ### TaskState ```python TaskState: TypeAlias = Literal[ "submitted", "working", "input-required", "completed", "canceled", "failed", "rejected", "auth-required", "unknown", ] ``` The possible states of a task. ### TaskStatus Bases: `TypedDict` Status and accompanying message for a task. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskStatus(TypedDict): """Status and accompanying message for a task.""" state: TaskState """The current state of the task.""" message: NotRequired[Message] """Additional status updates for client.""" timestamp: NotRequired[str] """ISO datetime value of when the status was updated.""" ``` #### state ```python state: TaskState ``` The current state of the task. #### message ```python message: NotRequired[Message] ``` Additional status updates for client. #### timestamp ```python timestamp: NotRequired[str] ``` ISO datetime value of when the status was updated. ### Task Bases: `TypedDict` A Task is a stateful entity that allows Clients and Remote Agents to achieve a specific outcome. Clients and Remote Agents exchange Messages within a Task. Remote Agents generate results as Artifacts. A Task is always created by a Client and the status is always determined by the Remote Agent. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Task(TypedDict): """A Task is a stateful entity that allows Clients and Remote Agents to achieve a specific outcome. Clients and Remote Agents exchange Messages within a Task. Remote Agents generate results as Artifacts. A Task is always created by a Client and the status is always determined by the Remote Agent. """ id: str """Unique identifier for the task.""" context_id: str """The context the task is associated with.""" kind: Literal['task'] """Event type.""" status: TaskStatus """Current status of the task.""" history: NotRequired[list[Message]] """Optional history of messages.""" artifacts: NotRequired[list[Artifact]] """Collection of artifacts created by the agent.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### id ```python id: str ``` Unique identifier for the task. #### context_id ```python context_id: str ``` The context the task is associated with. #### kind ```python kind: Literal['task'] ``` Event type. #### status ```python status: TaskStatus ``` Current status of the task. #### history ```python history: NotRequired[list[Message]] ``` Optional history of messages. #### artifacts ```python artifacts: NotRequired[list[Artifact]] ``` Collection of artifacts created by the agent. 
#### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### TaskStatusUpdateEvent Bases: `TypedDict` Sent by server during message/stream requests. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskStatusUpdateEvent(TypedDict): """Sent by server during message/stream requests.""" task_id: str """The id of the task.""" context_id: str """The context the task is associated with.""" kind: Literal['status-update'] """Event type.""" status: TaskStatus """The status of the task.""" final: bool """Indicates the end of the event stream.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### task_id ```python task_id: str ``` The id of the task. #### context_id ```python context_id: str ``` The context the task is associated with. #### kind ```python kind: Literal['status-update'] ``` Event type. #### status ```python status: TaskStatus ``` The status of the task. #### final ```python final: bool ``` Indicates the end of the event stream. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### TaskArtifactUpdateEvent Bases: `TypedDict` Sent by server during message/stream requests. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskArtifactUpdateEvent(TypedDict): """Sent by server during message/stream requests.""" task_id: str """The id of the task.""" context_id: str """The context the task is associated with.""" kind: Literal['artifact-update'] """Event type identification.""" artifact: Artifact """The artifact that was updated.""" append: NotRequired[bool] """Whether to append to existing artifact (true) or replace (false).""" last_chunk: NotRequired[bool] """Indicates this is the final chunk of the artifact.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### task_id ```python task_id: str ``` The id of the task. #### context_id ```python context_id: str ``` The context the task is associated with. #### kind ```python kind: Literal['artifact-update'] ``` Event type identification. #### artifact ```python artifact: Artifact ``` The artifact that was updated. #### append ```python append: NotRequired[bool] ``` Whether to append to existing artifact (true) or replace (false). #### last_chunk ```python last_chunk: NotRequired[bool] ``` Indicates this is the final chunk of the artifact. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### TaskIdParams Bases: `TypedDict` Parameters for a task id. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskIdParams(TypedDict): """Parameters for a task id.""" id: str metadata: NotRequired[dict[str, Any]] ``` ### TaskQueryParams Bases: `TaskIdParams` Query parameters for a task. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskQueryParams(TaskIdParams): """Query parameters for a task.""" history_length: NotRequired[int] """Number of recent messages to be retrieved.""" ``` #### history_length ```python history_length: NotRequired[int] ``` Number of recent messages to be retrieved. ### MessageSendConfiguration Bases: `TypedDict` Configuration for the send message request. 
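As with the other schema types this is a `TypedDict`, so a configuration is an ordinary dict using the documented keys; the values below are illustrative:

```python
from fasta2a.schema import MessageSendConfiguration

config: MessageSendConfiguration = {
    'accepted_output_modes': ['application/json'],
    'blocking': True,  # treat the send as a blocking request
    'history_length': 10,  # number of recent messages to return
}
```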
Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class MessageSendConfiguration(TypedDict): """Configuration for the send message request.""" accepted_output_modes: list[str] """Accepted output modalities by the client.""" blocking: NotRequired[bool] """If the server should treat the client as a blocking request.""" history_length: NotRequired[int] """Number of recent messages to be retrieved.""" push_notification_config: NotRequired[PushNotificationConfig] """Where the server should send notifications when disconnected.""" ``` #### accepted_output_modes ```python accepted_output_modes: list[str] ``` Accepted output modalities by the client. #### blocking ```python blocking: NotRequired[bool] ``` If the server should treat the client as a blocking request. #### history_length ```python history_length: NotRequired[int] ``` Number of recent messages to be retrieved. #### push_notification_config ```python push_notification_config: NotRequired[ PushNotificationConfig ] ``` Where the server should send notifications when disconnected. ### MessageSendParams Bases: `TypedDict` Parameters for message/send method. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class MessageSendParams(TypedDict): """Parameters for message/send method.""" configuration: NotRequired[MessageSendConfiguration] """Send message configuration.""" message: Message """The message being sent to the server.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### configuration ```python configuration: NotRequired[MessageSendConfiguration] ``` Send message configuration. #### message ```python message: Message ``` The message being sent to the server. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### TaskSendParams Bases: `TypedDict` Internal parameters for task execution within the framework. Note: This is not part of the A2A protocol - it's used internally for broker/worker communication. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskSendParams(TypedDict): """Internal parameters for task execution within the framework. Note: This is not part of the A2A protocol - it's used internally for broker/worker communication. """ id: str """The id of the task.""" context_id: str """The context id for the task.""" message: Message """The message to process.""" history_length: NotRequired[int] """Number of recent messages to be retrieved.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### id ```python id: str ``` The id of the task. #### context_id ```python context_id: str ``` The context id for the task. #### message ```python message: Message ``` The message to process. #### history_length ```python history_length: NotRequired[int] ``` Number of recent messages to be retrieved. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### ListTaskPushNotificationConfigParams Bases: `TypedDict` Parameters for getting list of pushNotificationConfigurations associated with a Task. 
Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class ListTaskPushNotificationConfigParams(TypedDict): """Parameters for getting list of pushNotificationConfigurations associated with a Task.""" id: str """Task id.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### id ```python id: str ``` Task id. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### DeleteTaskPushNotificationConfigParams Bases: `TypedDict` Parameters for removing pushNotificationConfiguration associated with a Task. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class DeleteTaskPushNotificationConfigParams(TypedDict): """Parameters for removing pushNotificationConfiguration associated with a Task.""" id: str """Task id.""" push_notification_config_id: str """The push notification config id to delete.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### id ```python id: str ``` Task id. #### push_notification_config_id ```python push_notification_config_id: str ``` The push notification config id to delete. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### JSONRPCMessage Bases: `TypedDict` A JSON RPC message. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class JSONRPCMessage(TypedDict): """A JSON RPC message.""" jsonrpc: Literal['2.0'] """The JSON RPC version.""" id: int | str | None """The request id.""" ``` #### jsonrpc ```python jsonrpc: Literal['2.0'] ``` The JSON RPC version. #### id ```python id: int | str | None ``` The request id. ### JSONRPCRequest Bases: `JSONRPCMessage`, `Generic[Method, Params]` A JSON RPC request. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class JSONRPCRequest(JSONRPCMessage, Generic[Method, Params]): """A JSON RPC request.""" method: Method """The method to call.""" params: Params """The parameters to pass to the method.""" ``` #### method ```python method: Method ``` The method to call. #### params ```python params: Params ``` The parameters to pass to the method. ### JSONRPCError Bases: `TypedDict`, `Generic[CodeT, MessageT]` A JSON RPC error. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class JSONRPCError(TypedDict, Generic[CodeT, MessageT]): """A JSON RPC error.""" code: CodeT message: MessageT data: NotRequired[Any] ``` ### JSONRPCResponse Bases: `JSONRPCMessage`, `Generic[ResultT, ErrorT]` A JSON RPC response. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class JSONRPCResponse(JSONRPCMessage, Generic[ResultT, ErrorT]): """A JSON RPC response.""" result: NotRequired[ResultT] error: NotRequired[ErrorT] ``` ### JSONParseError ```python JSONParseError = JSONRPCError[ Literal[-32700], Literal["Invalid JSON payload"] ] ``` A JSON RPC error for a parse error. ### InvalidRequestError ```python InvalidRequestError = JSONRPCError[ Literal[-32600], Literal["Request payload validation error"], ] ``` A JSON RPC error for an invalid request. ### MethodNotFoundError ```python MethodNotFoundError = JSONRPCError[ Literal[-32601], Literal["Method not found"] ] ``` A JSON RPC error for a method not found. ### InvalidParamsError ```python InvalidParamsError = JSONRPCError[ Literal[-32602], Literal["Invalid parameters"] ] ``` A JSON RPC error for invalid parameters. 
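Because these error aliases pin both the `code` and the `message` to `Literal` values, a payload only validates against a given alias when both match exactly. Here is a minimal sketch of that behaviour, assuming the types are importable from `fasta2a.schema` (the module path shown in the source listings above):

```python
from pydantic import TypeAdapter, ValidationError

from fasta2a.schema import InvalidParamsError

invalid_params_ta = TypeAdapter(InvalidParamsError)

# Matches the literal code and message, so validation succeeds.
error = invalid_params_ta.validate_python({'code': -32602, 'message': 'Invalid parameters'})
print(error['code'])
#> -32602

try:
    # A different code fails validation because the alias pins `code` to -32602.
    invalid_params_ta.validate_python({'code': -32601, 'message': 'Invalid parameters'})
except ValidationError:
    print('not an InvalidParamsError')
#> not an InvalidParamsError
```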
### InternalError ```python InternalError = JSONRPCError[ Literal[-32603], Literal["Internal error"] ] ``` A JSON RPC error for an internal error. ### TaskNotFoundError ```python TaskNotFoundError = JSONRPCError[ Literal[-32001], Literal["Task not found"] ] ``` A JSON RPC error for a task not found. ### TaskNotCancelableError ```python TaskNotCancelableError = JSONRPCError[ Literal[-32002], Literal["Task not cancelable"] ] ``` A JSON RPC error for a task not cancelable. ### PushNotificationNotSupportedError ```python PushNotificationNotSupportedError = JSONRPCError[ Literal[-32003], Literal["Push notification not supported"], ] ``` A JSON RPC error for a push notification not supported. ### UnsupportedOperationError ```python UnsupportedOperationError = JSONRPCError[ Literal[-32004], Literal["This operation is not supported"], ] ``` A JSON RPC error for an unsupported operation. ### ContentTypeNotSupportedError ```python ContentTypeNotSupportedError = JSONRPCError[ Literal[-32005], Literal["Incompatible content types"] ] ``` A JSON RPC error for incompatible content types. ### InvalidAgentResponseError ```python InvalidAgentResponseError = JSONRPCError[ Literal[-32006], Literal["Invalid agent response"] ] ``` A JSON RPC error for invalid agent response. ### SendMessageRequest ```python SendMessageRequest = JSONRPCRequest[ Literal["message/send"], MessageSendParams ] ``` A JSON RPC request to send a message. ### SendMessageResponse ```python SendMessageResponse = JSONRPCResponse[ Union[Task, Message], JSONRPCError[Any, Any] ] ``` A JSON RPC response to send a message. ### StreamMessageRequest ```python StreamMessageRequest = JSONRPCRequest[ Literal["message/stream"], MessageSendParams ] ``` A JSON RPC request to stream a message. ### GetTaskRequest ```python GetTaskRequest = JSONRPCRequest[ Literal["tasks/get"], TaskQueryParams ] ``` A JSON RPC request to get a task. ### GetTaskResponse ```python GetTaskResponse = JSONRPCResponse[Task, TaskNotFoundError] ``` A JSON RPC response to get a task. ### CancelTaskRequest ```python CancelTaskRequest = JSONRPCRequest[ Literal["tasks/cancel"], TaskIdParams ] ``` A JSON RPC request to cancel a task. ### CancelTaskResponse ```python CancelTaskResponse = JSONRPCResponse[ Task, Union[TaskNotCancelableError, TaskNotFoundError] ] ``` A JSON RPC response to cancel a task. ### SetTaskPushNotificationRequest ```python SetTaskPushNotificationRequest = JSONRPCRequest[ Literal["tasks/pushNotification/set"], TaskPushNotificationConfig, ] ``` A JSON RPC request to set a task push notification. ### SetTaskPushNotificationResponse ```python SetTaskPushNotificationResponse = JSONRPCResponse[ TaskPushNotificationConfig, PushNotificationNotSupportedError, ] ``` A JSON RPC response to set a task push notification. ### GetTaskPushNotificationRequest ```python GetTaskPushNotificationRequest = JSONRPCRequest[ Literal["tasks/pushNotification/get"], TaskIdParams ] ``` A JSON RPC request to get a task push notification. ### GetTaskPushNotificationResponse ```python GetTaskPushNotificationResponse = JSONRPCResponse[ TaskPushNotificationConfig, PushNotificationNotSupportedError, ] ``` A JSON RPC response to get a task push notification. ### ResubscribeTaskRequest ```python ResubscribeTaskRequest = JSONRPCRequest[ Literal["tasks/resubscribe"], TaskIdParams ] ``` A JSON RPC request to resubscribe to a task. 
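These request aliases are plain `TypedDict`s, so a request body can be built as a dictionary and serialized with a pydantic `TypeAdapter`, which is essentially what `A2AClient` does internally (see `A2AClient` further down this page). A minimal sketch, assuming the types are importable from `fasta2a.schema`; the adapter is created locally here purely for illustration, and the exact JSON shown is indicative:

```python
from pydantic import TypeAdapter

from fasta2a.schema import GetTaskRequest

get_task_request_ta = TypeAdapter(GetTaskRequest)

request = GetTaskRequest(
    jsonrpc='2.0',
    id=1,
    method='tasks/get',
    params={'id': 'task-123', 'history_length': 10},
)
# `by_alias=True` applies the camelCase alias generator, e.g. `history_length` -> `historyLength`.
print(get_task_request_ta.dump_json(request, by_alias=True).decode())
#> {"jsonrpc":"2.0","id":1,"method":"tasks/get","params":{"id":"task-123","historyLength":10}}
```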
### ListTaskPushNotificationConfigRequest ```python ListTaskPushNotificationConfigRequest = JSONRPCRequest[ Literal["tasks/pushNotificationConfig/list"], ListTaskPushNotificationConfigParams, ] ``` A JSON RPC request to list task push notification configs. ### DeleteTaskPushNotificationConfigRequest ```python DeleteTaskPushNotificationConfigRequest = JSONRPCRequest[ Literal["tasks/pushNotificationConfig/delete"], DeleteTaskPushNotificationConfigParams, ] ``` A JSON RPC request to delete a task push notification config. ### A2ARequest ```python A2ARequest = Annotated[ Union[ SendMessageRequest, StreamMessageRequest, GetTaskRequest, CancelTaskRequest, SetTaskPushNotificationRequest, GetTaskPushNotificationRequest, ResubscribeTaskRequest, ListTaskPushNotificationConfigRequest, DeleteTaskPushNotificationConfigRequest, ], Discriminator("method"), ] ``` A JSON RPC request to the A2A server. ### A2AResponse ```python A2AResponse: TypeAlias = Union[ SendMessageResponse, GetTaskResponse, CancelTaskResponse, SetTaskPushNotificationResponse, GetTaskPushNotificationResponse, ] ``` A JSON RPC response from the A2A server. ### A2AClient A client for the A2A protocol. Source code in `.venv/lib/python3.12/site-packages/fasta2a/client.py` ```python class A2AClient: """A client for the A2A protocol.""" def __init__(self, base_url: str = 'http://localhost:8000', http_client: httpx.AsyncClient | None = None) -> None: if http_client is None: self.http_client = httpx.AsyncClient(base_url=base_url) else: self.http_client = http_client self.http_client.base_url = base_url async def send_message( self, message: Message, *, metadata: dict[str, Any] | None = None, configuration: MessageSendConfiguration | None = None, ) -> SendMessageResponse: """Send a message using the A2A protocol. Returns a JSON-RPC response containing either a result (Task) or an error. """ params = MessageSendParams(message=message) if metadata is not None: params['metadata'] = metadata if configuration is not None: params['configuration'] = configuration request_id = str(uuid.uuid4()) payload = SendMessageRequest(jsonrpc='2.0', id=request_id, method='message/send', params=params) content = send_message_request_ta.dump_json(payload, by_alias=True) response = await self.http_client.post('/', content=content, headers={'Content-Type': 'application/json'}) self._raise_for_status(response) return send_message_response_ta.validate_json(response.content) async def get_task(self, task_id: str) -> GetTaskResponse: payload = GetTaskRequest(jsonrpc='2.0', id=None, method='tasks/get', params={'id': task_id}) content = a2a_request_ta.dump_json(payload, by_alias=True) response = await self.http_client.post('/', content=content, headers={'Content-Type': 'application/json'}) self._raise_for_status(response) return get_task_response_ta.validate_json(response.content) def _raise_for_status(self, response: httpx.Response) -> None: if response.status_code >= 400: raise UnexpectedResponseError(response.status_code, response.text) ``` #### send_message ```python send_message( message: Message, *, metadata: dict[str, Any] | None = None, configuration: MessageSendConfiguration | None = None ) -> SendMessageResponse ``` Send a message using the A2A protocol. Returns a JSON-RPC response containing either a result (Task) or an error. 
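A minimal usage sketch, assuming an A2A server is running on `localhost:8000`. The `Message` literal below is an assumption based on the A2A message shape documented earlier on this page (a `role`, a list of parts, the `'message'` kind and a message id); check the `Message` schema for the exact required fields:

```python
import asyncio
from uuid import uuid4

from fasta2a.client import A2AClient
from fasta2a.schema import Message


async def main():
    client = A2AClient(base_url='http://localhost:8000')
    # Field names here are assumptions; adjust them to the documented `Message` schema.
    message = Message(
        role='user',
        parts=[{'kind': 'text', 'text': 'What is the capital of France?'}],
        kind='message',
        message_id=str(uuid4()),
    )
    response = await client.send_message(message)
    # The JSON-RPC response carries either a `result` or an `error`, never both.
    if 'error' in response:
        print(f'Error {response["error"]["code"]}: {response["error"]["message"]}')
    else:
        print(response['result'])


asyncio.run(main())
```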
Source code in `.venv/lib/python3.12/site-packages/fasta2a/client.py` ```python async def send_message( self, message: Message, *, metadata: dict[str, Any] | None = None, configuration: MessageSendConfiguration | None = None, ) -> SendMessageResponse: """Send a message using the A2A protocol. Returns a JSON-RPC response containing either a result (Task) or an error. """ params = MessageSendParams(message=message) if metadata is not None: params['metadata'] = metadata if configuration is not None: params['configuration'] = configuration request_id = str(uuid.uuid4()) payload = SendMessageRequest(jsonrpc='2.0', id=request_id, method='message/send', params=params) content = send_message_request_ta.dump_json(payload, by_alias=True) response = await self.http_client.post('/', content=content, headers={'Content-Type': 'application/json'}) self._raise_for_status(response) return send_message_response_ta.validate_json(response.content) ``` ### UnexpectedResponseError Bases: `Exception` An error raised when an unexpected response is received from the server. Source code in `.venv/lib/python3.12/site-packages/fasta2a/client.py` ```python class UnexpectedResponseError(Exception): """An error raised when an unexpected response is received from the server.""" def __init__(self, status_code: int, content: str) -> None: self.status_code = status_code self.content = content ``` # `pydantic_ai.format_prompt` ### format_as_xml ```python format_as_xml( obj: Any, root_tag: str | None = None, item_tag: str = "item", none_str: str = "null", indent: str | None = " ", ) -> str ``` Format a Python object as XML. This is useful since LLMs often find it easier to read semi-structured data (e.g. examples) as XML, rather than JSON etc. Supports: `str`, `bytes`, `bytearray`, `bool`, `int`, `float`, `date`, `datetime`, `Mapping`, `Iterable`, `dataclass`, and `BaseModel`. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `obj` | `Any` | Python Object to serialize to XML. | *required* | | `root_tag` | `str | None` | Outer tag to wrap the XML in, use None to omit the outer tag. | `None` | | `item_tag` | `str` | Tag to use for each item in an iterable (e.g. list), this is overridden by the class name for dataclasses and Pydantic models. | `'item'` | | `none_str` | `str` | String to use for None values. | `'null'` | | `indent` | `str | None` | Indentation string to use for pretty printing. | `' '` | Returns: | Type | Description | | --- | --- | | `str` | XML representation of the object. | Example: format_as_xml_example.py ```python from pydantic_ai import format_as_xml print(format_as_xml({'name': 'John', 'height': 6, 'weight': 200}, root_tag='user')) ''' John 6 200 ''' ``` Source code in `pydantic_ai_slim/pydantic_ai/format_prompt.py` ````python def format_as_xml( obj: Any, root_tag: str | None = None, item_tag: str = 'item', none_str: str = 'null', indent: str | None = ' ', ) -> str: """Format a Python object as XML. This is useful since LLMs often find it easier to read semi-structured data (e.g. examples) as XML, rather than JSON etc. Supports: `str`, `bytes`, `bytearray`, `bool`, `int`, `float`, `date`, `datetime`, `Mapping`, `Iterable`, `dataclass`, and `BaseModel`. Args: obj: Python Object to serialize to XML. root_tag: Outer tag to wrap the XML in, use `None` to omit the outer tag. item_tag: Tag to use for each item in an iterable (e.g. list), this is overridden by the class name for dataclasses and Pydantic models. none_str: String to use for `None` values. 
indent: Indentation string to use for pretty printing. Returns: XML representation of the object. Example: ```python {title="format_as_xml_example.py" lint="skip"} from pydantic_ai import format_as_xml print(format_as_xml({'name': 'John', 'height': 6, 'weight': 200}, root_tag='user')) ''' John 6 200 ''' ``` """ el = _ToXml(item_tag=item_tag, none_str=none_str).to_xml(obj, root_tag) if root_tag is None and el.text is None: join = '' if indent is None else '\n' return join.join(_rootless_xml_elements(el, indent)) else: if indent is not None: ElementTree.indent(el, space=indent) return ElementTree.tostring(el, encoding='unicode') ```` # `pydantic_ai.mcp` ### MCPServer Bases: `AbstractToolset[Any]`, `ABC` Base class for attaching agents to MCP servers. See for more information. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python class MCPServer(AbstractToolset[Any], ABC): """Base class for attaching agents to MCP servers. See for more information. """ tool_prefix: str | None """A prefix to add to all tools that are registered with the server. If not empty, will include a trailing underscore(`_`). e.g. if `tool_prefix='foo'`, then a tool named `bar` will be registered as `foo_bar` """ log_level: mcp_types.LoggingLevel | None """The log level to set when connecting to the server, if any. See for more details. If `None`, no log level will be set. """ log_handler: LoggingFnT | None """A handler for logging messages from the server.""" timeout: float """The timeout in seconds to wait for the client to initialize.""" read_timeout: float """Maximum time in seconds to wait for new messages before timing out. This timeout applies to the long-lived connection after it's established. If no new messages are received within this time, the connection will be considered stale and may be closed. Defaults to 5 minutes (300 seconds). 
""" process_tool_call: ProcessToolCallback | None """Hook to customize tool calling and optionally pass extra metadata.""" allow_sampling: bool """Whether to allow MCP sampling through this client.""" sampling_model: models.Model | None """The model to use for sampling.""" max_retries: int """The maximum number of times to retry a tool call.""" elicitation_callback: ElicitationFnT | None = None """Callback function to handle elicitation requests from the server.""" _id: str | None _enter_lock: Lock = field(compare=False) _running_count: int _exit_stack: AsyncExitStack | None _client: ClientSession _read_stream: MemoryObjectReceiveStream[SessionMessage | Exception] _write_stream: MemoryObjectSendStream[SessionMessage] def __init__( self, tool_prefix: str | None = None, log_level: mcp_types.LoggingLevel | None = None, log_handler: LoggingFnT | None = None, timeout: float = 5, read_timeout: float = 5 * 60, process_tool_call: ProcessToolCallback | None = None, allow_sampling: bool = True, sampling_model: models.Model | None = None, max_retries: int = 1, elicitation_callback: ElicitationFnT | None = None, *, id: str | None = None, ): self.tool_prefix = tool_prefix self.log_level = log_level self.log_handler = log_handler self.timeout = timeout self.read_timeout = read_timeout self.process_tool_call = process_tool_call self.allow_sampling = allow_sampling self.sampling_model = sampling_model self.max_retries = max_retries self.elicitation_callback = elicitation_callback self._id = id or tool_prefix self.__post_init__() def __post_init__(self): self._enter_lock = Lock() self._running_count = 0 self._exit_stack = None @abstractmethod @asynccontextmanager async def client_streams( self, ) -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[SessionMessage | Exception], MemoryObjectSendStream[SessionMessage], ] ]: """Create the streams for the MCP server.""" raise NotImplementedError('MCP Server subclasses must implement this method.') yield @property def id(self) -> str | None: return self._id @property def label(self) -> str: if self.id: return super().label # pragma: no cover else: return repr(self) @property def tool_name_conflict_hint(self) -> str: return 'Set the `tool_prefix` attribute to avoid name conflicts.' async def list_tools(self) -> list[mcp_types.Tool]: """Retrieve tools that are currently active on the server. Note: - We don't cache tools as they might change. - We also don't subscribe to the server to avoid complexity. """ async with self: # Ensure server is running result = await self._client.list_tools() return result.tools async def direct_call_tool( self, name: str, args: dict[str, Any], metadata: dict[str, Any] | None = None, ) -> ToolResult: """Call a tool on the server. Args: name: The name of the tool to call. args: The arguments to pass to the tool. metadata: Request-level metadata (optional) Returns: The result of the tool call. Raises: ModelRetry: If the tool call fails. 
""" async with self: # Ensure server is running try: result = await self._client.send_request( mcp_types.ClientRequest( mcp_types.CallToolRequest( method='tools/call', params=mcp_types.CallToolRequestParams( name=name, arguments=args, _meta=mcp_types.RequestParams.Meta(**metadata) if metadata else None, ), ) ), mcp_types.CallToolResult, ) except McpError as e: raise exceptions.ModelRetry(e.error.message) content = [await self._map_tool_result_part(part) for part in result.content] if result.isError: text = '\n'.join(str(part) for part in content) raise exceptions.ModelRetry(text) else: return content[0] if len(content) == 1 else content async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[Any], tool: ToolsetTool[Any], ) -> ToolResult: if self.tool_prefix: name = name.removeprefix(f'{self.tool_prefix}_') ctx = replace(ctx, tool_name=name) if self.process_tool_call is not None: return await self.process_tool_call(ctx, self.direct_call_tool, name, tool_args) else: return await self.direct_call_tool(name, tool_args) async def get_tools(self, ctx: RunContext[Any]) -> dict[str, ToolsetTool[Any]]: return { name: self.tool_for_tool_def( ToolDefinition( name=name, description=mcp_tool.description, parameters_json_schema=mcp_tool.inputSchema, ), ) for mcp_tool in await self.list_tools() if (name := f'{self.tool_prefix}_{mcp_tool.name}' if self.tool_prefix else mcp_tool.name) } def tool_for_tool_def(self, tool_def: ToolDefinition) -> ToolsetTool[Any]: return ToolsetTool( toolset=self, tool_def=tool_def, max_retries=self.max_retries, args_validator=TOOL_SCHEMA_VALIDATOR, ) async def __aenter__(self) -> Self: """Enter the MCP server context. This will initialize the connection to the server. If this server is an [`MCPServerStdio`][pydantic_ai.mcp.MCPServerStdio], the server will first be started as a subprocess. This is a no-op if the MCP server has already been entered. 
""" async with self._enter_lock: if self._running_count == 0: async with AsyncExitStack() as exit_stack: self._read_stream, self._write_stream = await exit_stack.enter_async_context(self.client_streams()) client = ClientSession( read_stream=self._read_stream, write_stream=self._write_stream, sampling_callback=self._sampling_callback if self.allow_sampling else None, elicitation_callback=self.elicitation_callback, logging_callback=self.log_handler, read_timeout_seconds=timedelta(seconds=self.read_timeout), ) self._client = await exit_stack.enter_async_context(client) with anyio.fail_after(self.timeout): await self._client.initialize() if log_level := self.log_level: await self._client.set_logging_level(log_level) self._exit_stack = exit_stack.pop_all() self._running_count += 1 return self async def __aexit__(self, *args: Any) -> bool | None: if self._running_count == 0: raise ValueError('MCPServer.__aexit__ called more times than __aenter__') async with self._enter_lock: self._running_count -= 1 if self._running_count == 0 and self._exit_stack is not None: await self._exit_stack.aclose() self._exit_stack = None @property def is_running(self) -> bool: """Check if the MCP server is running.""" return bool(self._running_count) async def _sampling_callback( self, context: RequestContext[ClientSession, Any], params: mcp_types.CreateMessageRequestParams ) -> mcp_types.CreateMessageResult | mcp_types.ErrorData: """MCP sampling callback.""" if self.sampling_model is None: raise ValueError('Sampling model is not set') # pragma: no cover pai_messages = _mcp.map_from_mcp_params(params) model_settings = models.ModelSettings() if max_tokens := params.maxTokens: # pragma: no branch model_settings['max_tokens'] = max_tokens if temperature := params.temperature: # pragma: no branch model_settings['temperature'] = temperature if stop_sequences := params.stopSequences: # pragma: no branch model_settings['stop_sequences'] = stop_sequences model_response = await model_request(self.sampling_model, pai_messages, model_settings=model_settings) return mcp_types.CreateMessageResult( role='assistant', content=_mcp.map_from_model_response(model_response), model=self.sampling_model.model_name, ) async def _map_tool_result_part( self, part: mcp_types.ContentBlock ) -> str | messages.BinaryContent | dict[str, Any] | list[Any]: # See https://github.com/jlowin/fastmcp/blob/main/docs/servers/tools.mdx#return-values if isinstance(part, mcp_types.TextContent): text = part.text if text.startswith(('[', '{')): try: return pydantic_core.from_json(text) except ValueError: pass return text elif isinstance(part, mcp_types.ImageContent): return messages.BinaryContent(data=base64.b64decode(part.data), media_type=part.mimeType) elif isinstance(part, mcp_types.AudioContent): # NOTE: The FastMCP server doesn't support audio content. # See for more details. 
return messages.BinaryContent( data=base64.b64decode(part.data), media_type=part.mimeType ) # pragma: no cover elif isinstance(part, mcp_types.EmbeddedResource): resource = part.resource return self._get_content(resource) elif isinstance(part, mcp_types.ResourceLink): resource_result: mcp_types.ReadResourceResult = await self._client.read_resource(part.uri) return ( self._get_content(resource_result.contents[0]) if len(resource_result.contents) == 1 else [self._get_content(resource) for resource in resource_result.contents] ) else: assert_never(part) def _get_content( self, resource: mcp_types.TextResourceContents | mcp_types.BlobResourceContents ) -> str | messages.BinaryContent: if isinstance(resource, mcp_types.TextResourceContents): return resource.text elif isinstance(resource, mcp_types.BlobResourceContents): return messages.BinaryContent( data=base64.b64decode(resource.blob), media_type=resource.mimeType or 'application/octet-stream' ) else: assert_never(resource) ``` #### tool_prefix ```python tool_prefix: str | None = tool_prefix ``` A prefix to add to all tools that are registered with the server. If not empty, will include a trailing underscore(`_`). e.g. if `tool_prefix='foo'`, then a tool named `bar` will be registered as `foo_bar` #### log_level ```python log_level: LoggingLevel | None = log_level ``` The log level to set when connecting to the server, if any. See for more details. If `None`, no log level will be set. #### log_handler ```python log_handler: LoggingFnT | None = log_handler ``` A handler for logging messages from the server. #### timeout ```python timeout: float = timeout ``` The timeout in seconds to wait for the client to initialize. #### read_timeout ```python read_timeout: float = read_timeout ``` Maximum time in seconds to wait for new messages before timing out. This timeout applies to the long-lived connection after it's established. If no new messages are received within this time, the connection will be considered stale and may be closed. Defaults to 5 minutes (300 seconds). #### process_tool_call ```python process_tool_call: ProcessToolCallback | None = ( process_tool_call ) ``` Hook to customize tool calling and optionally pass extra metadata. #### allow_sampling ```python allow_sampling: bool = allow_sampling ``` Whether to allow MCP sampling through this client. #### sampling_model ```python sampling_model: Model | None = sampling_model ``` The model to use for sampling. #### max_retries ```python max_retries: int = max_retries ``` The maximum number of times to retry a tool call. #### elicitation_callback ```python elicitation_callback: ElicitationFnT | None = ( elicitation_callback ) ``` Callback function to handle elicitation requests from the server. #### client_streams ```python client_streams() -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[ SessionMessage | Exception ], MemoryObjectSendStream[SessionMessage], ] ] ``` Create the streams for the MCP server. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python @abstractmethod @asynccontextmanager async def client_streams( self, ) -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[SessionMessage | Exception], MemoryObjectSendStream[SessionMessage], ] ]: """Create the streams for the MCP server.""" raise NotImplementedError('MCP Server subclasses must implement this method.') yield ``` #### list_tools ```python list_tools() -> list[Tool] ``` Retrieve tools that are currently active on the server. Note: - We don't cache tools as they might change. 
- We also don't subscribe to the server to avoid complexity. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python async def list_tools(self) -> list[mcp_types.Tool]: """Retrieve tools that are currently active on the server. Note: - We don't cache tools as they might change. - We also don't subscribe to the server to avoid complexity. """ async with self: # Ensure server is running result = await self._client.list_tools() return result.tools ``` #### direct_call_tool ```python direct_call_tool( name: str, args: dict[str, Any], metadata: dict[str, Any] | None = None, ) -> ToolResult ``` Call a tool on the server. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str` | The name of the tool to call. | *required* | | `args` | `dict[str, Any]` | The arguments to pass to the tool. | *required* | | `metadata` | `dict[str, Any] | None` | Request-level metadata (optional) | `None` | Returns: | Type | Description | | --- | --- | | `ToolResult` | The result of the tool call. | Raises: | Type | Description | | --- | --- | | `ModelRetry` | If the tool call fails. | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python async def direct_call_tool( self, name: str, args: dict[str, Any], metadata: dict[str, Any] | None = None, ) -> ToolResult: """Call a tool on the server. Args: name: The name of the tool to call. args: The arguments to pass to the tool. metadata: Request-level metadata (optional) Returns: The result of the tool call. Raises: ModelRetry: If the tool call fails. """ async with self: # Ensure server is running try: result = await self._client.send_request( mcp_types.ClientRequest( mcp_types.CallToolRequest( method='tools/call', params=mcp_types.CallToolRequestParams( name=name, arguments=args, _meta=mcp_types.RequestParams.Meta(**metadata) if metadata else None, ), ) ), mcp_types.CallToolResult, ) except McpError as e: raise exceptions.ModelRetry(e.error.message) content = [await self._map_tool_result_part(part) for part in result.content] if result.isError: text = '\n'.join(str(part) for part in content) raise exceptions.ModelRetry(text) else: return content[0] if len(content) == 1 else content ``` #### __aenter__ ```python __aenter__() -> Self ``` Enter the MCP server context. This will initialize the connection to the server. If this server is an MCPServerStdio, the server will first be started as a subprocess. This is a no-op if the MCP server has already been entered. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python async def __aenter__(self) -> Self: """Enter the MCP server context. This will initialize the connection to the server. If this server is an [`MCPServerStdio`][pydantic_ai.mcp.MCPServerStdio], the server will first be started as a subprocess. This is a no-op if the MCP server has already been entered. 
""" async with self._enter_lock: if self._running_count == 0: async with AsyncExitStack() as exit_stack: self._read_stream, self._write_stream = await exit_stack.enter_async_context(self.client_streams()) client = ClientSession( read_stream=self._read_stream, write_stream=self._write_stream, sampling_callback=self._sampling_callback if self.allow_sampling else None, elicitation_callback=self.elicitation_callback, logging_callback=self.log_handler, read_timeout_seconds=timedelta(seconds=self.read_timeout), ) self._client = await exit_stack.enter_async_context(client) with anyio.fail_after(self.timeout): await self._client.initialize() if log_level := self.log_level: await self._client.set_logging_level(log_level) self._exit_stack = exit_stack.pop_all() self._running_count += 1 return self ``` #### is_running ```python is_running: bool ``` Check if the MCP server is running. ### MCPServerStdio Bases: `MCPServer` Runs an MCP server in a subprocess and communicates with it over stdin/stdout. This class implements the stdio transport from the MCP specification. See for more information. Note Using this class as an async context manager will start the server as a subprocess when entering the context, and stop it when exiting the context. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio( # (1)! 'uv', args=['run', 'mcp-run-python', 'stdio'], timeout=10 ) agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (2)! ... ``` 1. See [MCP Run Python](https://github.com/pydantic/mcp-run-python) for more information. 1. This will start the server as a subprocess and connect to it. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ````python class MCPServerStdio(MCPServer): """Runs an MCP server in a subprocess and communicates with it over stdin/stdout. This class implements the stdio transport from the MCP specification. See for more information. !!! note Using this class as an async context manager will start the server as a subprocess when entering the context, and stop it when exiting the context. Example: ```python {py="3.10"} from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio( # (1)! 'uv', args=['run', 'mcp-run-python', 'stdio'], timeout=10 ) agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (2)! ... ``` 1. See [MCP Run Python](https://github.com/pydantic/mcp-run-python) for more information. 2. This will start the server as a subprocess and connect to it. """ command: str """The command to run.""" args: Sequence[str] """The arguments to pass to the command.""" env: dict[str, str] | None """The environment variables the CLI server will have access to. By default the subprocess will not inherit any environment variables from the parent process. If you want to inherit the environment variables from the parent process, use `env=os.environ`. 
""" cwd: str | Path | None """The working directory to use when spawning the process.""" # last fields are re-defined from the parent class so they appear as fields tool_prefix: str | None log_level: mcp_types.LoggingLevel | None log_handler: LoggingFnT | None timeout: float read_timeout: float process_tool_call: ProcessToolCallback | None allow_sampling: bool sampling_model: models.Model | None max_retries: int elicitation_callback: ElicitationFnT | None = None def __init__( self, command: str, args: Sequence[str], *, env: dict[str, str] | None = None, cwd: str | Path | None = None, tool_prefix: str | None = None, log_level: mcp_types.LoggingLevel | None = None, log_handler: LoggingFnT | None = None, timeout: float = 5, read_timeout: float = 5 * 60, process_tool_call: ProcessToolCallback | None = None, allow_sampling: bool = True, sampling_model: models.Model | None = None, max_retries: int = 1, elicitation_callback: ElicitationFnT | None = None, id: str | None = None, ): """Build a new MCP server. Args: command: The command to run. args: The arguments to pass to the command. env: The environment variables to set in the subprocess. cwd: The working directory to use when spawning the process. tool_prefix: A prefix to add to all tools that are registered with the server. log_level: The log level to set when connecting to the server, if any. log_handler: A handler for logging messages from the server. timeout: The timeout in seconds to wait for the client to initialize. read_timeout: Maximum time in seconds to wait for new messages before timing out. process_tool_call: Hook to customize tool calling and optionally pass extra metadata. allow_sampling: Whether to allow MCP sampling through this client. sampling_model: The model to use for sampling. max_retries: The maximum number of times to retry a tool call. elicitation_callback: Callback function to handle elicitation requests from the server. id: An optional unique ID for the MCP server. An MCP server needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the server's activities within the workflow. 
""" self.command = command self.args = args self.env = env self.cwd = cwd super().__init__( tool_prefix, log_level, log_handler, timeout, read_timeout, process_tool_call, allow_sampling, sampling_model, max_retries, elicitation_callback, id=id, ) @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> CoreSchema: return core_schema.no_info_after_validator_function( lambda dct: MCPServerStdio(**dct), core_schema.typed_dict_schema( { 'command': core_schema.typed_dict_field(core_schema.str_schema()), 'args': core_schema.typed_dict_field(core_schema.list_schema(core_schema.str_schema())), 'env': core_schema.typed_dict_field( core_schema.dict_schema(core_schema.str_schema(), core_schema.str_schema()), required=False, ), } ), ) @asynccontextmanager async def client_streams( self, ) -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[SessionMessage | Exception], MemoryObjectSendStream[SessionMessage], ] ]: server = StdioServerParameters(command=self.command, args=list(self.args), env=self.env, cwd=self.cwd) async with stdio_client(server=server) as (read_stream, write_stream): yield read_stream, write_stream def __repr__(self) -> str: repr_args = [ f'command={self.command!r}', f'args={self.args!r}', ] if self.id: repr_args.append(f'id={self.id!r}') return f'{self.__class__.__name__}({", ".join(repr_args)})' def __eq__(self, value: object, /) -> bool: if not isinstance(value, MCPServerStdio): return False # pragma: no cover return ( self.command == value.command and self.args == value.args and self.env == value.env and self.cwd == value.cwd ) ```` #### __init__ ```python __init__( command: str, args: Sequence[str], *, env: dict[str, str] | None = None, cwd: str | Path | None = None, tool_prefix: str | None = None, log_level: LoggingLevel | None = None, log_handler: LoggingFnT | None = None, timeout: float = 5, read_timeout: float = 5 * 60, process_tool_call: ProcessToolCallback | None = None, allow_sampling: bool = True, sampling_model: Model | None = None, max_retries: int = 1, elicitation_callback: ElicitationFnT | None = None, id: str | None = None ) ``` Build a new MCP server. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `command` | `str` | The command to run. | *required* | | `args` | `Sequence[str]` | The arguments to pass to the command. | *required* | | `env` | `dict[str, str] | None` | The environment variables to set in the subprocess. | `None` | | `cwd` | `str | Path | None` | The working directory to use when spawning the process. | `None` | | `tool_prefix` | `str | None` | A prefix to add to all tools that are registered with the server. | `None` | | `log_level` | `LoggingLevel | None` | The log level to set when connecting to the server, if any. | `None` | | `log_handler` | `LoggingFnT | None` | A handler for logging messages from the server. | `None` | | `timeout` | `float` | The timeout in seconds to wait for the client to initialize. | `5` | | `read_timeout` | `float` | Maximum time in seconds to wait for new messages before timing out. | `5 * 60` | | `process_tool_call` | `ProcessToolCallback | None` | Hook to customize tool calling and optionally pass extra metadata. | `None` | | `allow_sampling` | `bool` | Whether to allow MCP sampling through this client. | `True` | | `sampling_model` | `Model | None` | The model to use for sampling. | `None` | | `max_retries` | `int` | The maximum number of times to retry a tool call. 
| `1` | | `elicitation_callback` | `ElicitationFnT | None` | Callback function to handle elicitation requests from the server. | `None` | | `id` | `str | None` | An optional unique ID for the MCP server. An MCP server needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the server's activities within the workflow. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python def __init__( self, command: str, args: Sequence[str], *, env: dict[str, str] | None = None, cwd: str | Path | None = None, tool_prefix: str | None = None, log_level: mcp_types.LoggingLevel | None = None, log_handler: LoggingFnT | None = None, timeout: float = 5, read_timeout: float = 5 * 60, process_tool_call: ProcessToolCallback | None = None, allow_sampling: bool = True, sampling_model: models.Model | None = None, max_retries: int = 1, elicitation_callback: ElicitationFnT | None = None, id: str | None = None, ): """Build a new MCP server. Args: command: The command to run. args: The arguments to pass to the command. env: The environment variables to set in the subprocess. cwd: The working directory to use when spawning the process. tool_prefix: A prefix to add to all tools that are registered with the server. log_level: The log level to set when connecting to the server, if any. log_handler: A handler for logging messages from the server. timeout: The timeout in seconds to wait for the client to initialize. read_timeout: Maximum time in seconds to wait for new messages before timing out. process_tool_call: Hook to customize tool calling and optionally pass extra metadata. allow_sampling: Whether to allow MCP sampling through this client. sampling_model: The model to use for sampling. max_retries: The maximum number of times to retry a tool call. elicitation_callback: Callback function to handle elicitation requests from the server. id: An optional unique ID for the MCP server. An MCP server needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the server's activities within the workflow. """ self.command = command self.args = args self.env = env self.cwd = cwd super().__init__( tool_prefix, log_level, log_handler, timeout, read_timeout, process_tool_call, allow_sampling, sampling_model, max_retries, elicitation_callback, id=id, ) ``` #### command ```python command: str = command ``` The command to run. #### args ```python args: Sequence[str] = args ``` The arguments to pass to the command. #### env ```python env: dict[str, str] | None = env ``` The environment variables the CLI server will have access to. By default the subprocess will not inherit any environment variables from the parent process. If you want to inherit the environment variables from the parent process, use `env=os.environ`. #### cwd ```python cwd: str | Path | None = cwd ``` The working directory to use when spawning the process. ### MCPServerSSE Bases: `_MCPServerHTTP` An MCP server that connects over streamable HTTP connections. This class implements the SSE transport from the MCP specification. See for more information. Note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. 
Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE server = MCPServerSSE('http://localhost:3001/sse') agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (1)! ... ``` 1. This will connect to a server running on `localhost:3001`. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ````python class MCPServerSSE(_MCPServerHTTP): """An MCP server that connects over streamable HTTP connections. This class implements the SSE transport from the MCP specification. See for more information. !!! note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python {py="3.10"} from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE server = MCPServerSSE('http://localhost:3001/sse') agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (1)! ... ``` 1. This will connect to a server running on `localhost:3001`. """ @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> CoreSchema: return core_schema.no_info_after_validator_function( lambda dct: MCPServerSSE(**dct), core_schema.typed_dict_schema( { 'url': core_schema.typed_dict_field(core_schema.str_schema()), 'headers': core_schema.typed_dict_field( core_schema.dict_schema(core_schema.str_schema(), core_schema.str_schema()), required=False ), } ), ) @property def _transport_client(self): return sse_client # pragma: no cover def __eq__(self, value: object, /) -> bool: if not isinstance(value, MCPServerSSE): return False # pragma: no cover return self.url == value.url ```` ### MCPServerHTTP Bases: `MCPServerSSE` Deprecated The `MCPServerHTTP` class is deprecated, use `MCPServerSSE` instead. An MCP server that connects over HTTP using the old SSE transport. This class implements the SSE transport from the MCP specification. See for more information. Note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerHTTP server = MCPServerHTTP('http://localhost:3001/sse') agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (2)! ... ``` 1. This will connect to a server running on `localhost:3001`. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ````python @deprecated('The `MCPServerHTTP` class is deprecated, use `MCPServerSSE` instead.') class MCPServerHTTP(MCPServerSSE): """An MCP server that connects over HTTP using the old SSE transport. This class implements the SSE transport from the MCP specification. See for more information. !!! note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python {py="3.10" test="skip"} from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerHTTP server = MCPServerHTTP('http://localhost:3001/sse') agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (2)! ... ``` 1. This will connect to a server running on `localhost:3001`. """ ```` ### MCPServerStreamableHTTP Bases: `_MCPServerHTTP` An MCP server that connects over HTTP using the Streamable HTTP transport. This class implements the Streamable HTTP transport from the MCP specification. See for more information. 
Note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp') # (1)! agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (2)! ... ``` Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ````python class MCPServerStreamableHTTP(_MCPServerHTTP): """An MCP server that connects over HTTP using the Streamable HTTP transport. This class implements the Streamable HTTP transport from the MCP specification. See for more information. !!! note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python {py="3.10"} from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp') # (1)! agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (2)! ... ``` """ @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> CoreSchema: return core_schema.no_info_after_validator_function( lambda dct: MCPServerStreamableHTTP(**dct), core_schema.typed_dict_schema( { 'url': core_schema.typed_dict_field(core_schema.str_schema()), 'headers': core_schema.typed_dict_field( core_schema.dict_schema(core_schema.str_schema(), core_schema.str_schema()), required=False ), } ), ) @property def _transport_client(self): return streamablehttp_client # pragma: no cover def __eq__(self, value: object, /) -> bool: if not isinstance(value, MCPServerStreamableHTTP): return False # pragma: no cover return self.url == value.url ```` ### load_mcp_servers ```python load_mcp_servers( config_path: str | Path, ) -> list[ MCPServerStdio | MCPServerStreamableHTTP | MCPServerSSE ] ``` Load MCP servers from a configuration file. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `config_path` | `str | Path` | The path to the configuration file. | *required* | Returns: | Type | Description | | --- | --- | | `list[MCPServerStdio | MCPServerStreamableHTTP | MCPServerSSE]` | A list of MCP servers. | Raises: | Type | Description | | --- | --- | | `FileNotFoundError` | If the configuration file does not exist. | | `ValidationError` | If the configuration file does not match the schema. | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python def load_mcp_servers(config_path: str | Path) -> list[MCPServerStdio | MCPServerStreamableHTTP | MCPServerSSE]: """Load MCP servers from a configuration file. Args: config_path: The path to the configuration file. Returns: A list of MCP servers. Raises: FileNotFoundError: If the configuration file does not exist. ValidationError: If the configuration file does not match the schema. 
""" config_path = Path(config_path) if not config_path.exists(): raise FileNotFoundError(f'Config file {config_path} not found') config = MCPServerConfig.model_validate_json(config_path.read_bytes()) return list(config.mcp_servers.values()) ``` # `pydantic_ai.messages` The structure of ModelMessage can be shown as a graph: ``` graph RL SystemPromptPart(SystemPromptPart) --- ModelRequestPart UserPromptPart(UserPromptPart) --- ModelRequestPart ToolReturnPart(ToolReturnPart) --- ModelRequestPart RetryPromptPart(RetryPromptPart) --- ModelRequestPart TextPart(TextPart) --- ModelResponsePart ToolCallPart(ToolCallPart) --- ModelResponsePart ThinkingPart(ThinkingPart) --- ModelResponsePart ModelRequestPart("ModelRequestPart
(Union)") --- ModelRequest ModelRequest("ModelRequest(parts=list[...])") --- ModelMessage ModelResponsePart("ModelResponsePart
(Union)") --- ModelResponse ModelResponse("ModelResponse(parts=list[...])") --- ModelMessage("ModelMessage
(Union)") ``` ### FinishReason ```python FinishReason: TypeAlias = Literal[ "stop", "length", "content_filter", "tool_call", "error" ] ``` Reason the model finished generating the response, normalized to OpenTelemetry values. ### SystemPromptPart A system prompt, generally written by the application developer. This gives the model context and guidance on how to respond. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class SystemPromptPart: """A system prompt, generally written by the application developer. This gives the model context and guidance on how to respond. """ content: str """The content of the prompt.""" _: KW_ONLY timestamp: datetime = field(default_factory=_now_utc) """The timestamp of the prompt.""" dynamic_ref: str | None = None """The ref of the dynamic system prompt function that generated this part. Only set if system prompt is dynamic, see [`system_prompt`][pydantic_ai.Agent.system_prompt] for more information. """ part_kind: Literal['system-prompt'] = 'system-prompt' """Part type identifier, this is available on all parts as a discriminator.""" def otel_event(self, settings: InstrumentationSettings) -> Event: return Event( 'gen_ai.system.message', body={'role': 'system', **({'content': self.content} if settings.include_content else {})}, ) def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: return [_otel_messages.TextPart(type='text', **{'content': self.content} if settings.include_content else {})] __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: str ``` The content of the prompt. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp of the prompt. #### dynamic_ref ```python dynamic_ref: str | None = None ``` The ref of the dynamic system prompt function that generated this part. Only set if system prompt is dynamic, see system_prompt for more information. #### part_kind ```python part_kind: Literal['system-prompt'] = 'system-prompt' ``` Part type identifier, this is available on all parts as a discriminator. ### FileUrl Bases: `ABC` Abstract base class for any URL-based file. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(init=False, repr=False) class FileUrl(ABC): """Abstract base class for any URL-based file.""" url: str """The URL of the file.""" _: KW_ONLY force_download: bool = False """If the model supports it: * If True, the file is downloaded and the data is sent to the model as bytes. * If False, the URL is sent directly to the model and no download is performed. """ vendor_metadata: dict[str, Any] | None = None """Vendor-specific metadata for the file. Supported by: - `GoogleModel`: `VideoUrl.vendor_metadata` is used as `video_metadata`: https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing """ _media_type: Annotated[str | None, pydantic.Field(alias='media_type', default=None, exclude=True)] = field( compare=False, default=None ) identifier: str | None = None """The identifier of the file, such as a unique ID. generating one from the url if not explicitly set This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `FileUrl`. This identifier is only automatically passed to the model when the `FileUrl` is returned by a tool. 
If you're passing the `FileUrl` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file :" preceding the `FileUrl`. """ def __init__( self, url: str, *, force_download: bool = False, vendor_metadata: dict[str, Any] | None = None, media_type: str | None = None, identifier: str | None = None, ) -> None: self.url = url self.force_download = force_download self.vendor_metadata = vendor_metadata self._media_type = media_type self.identifier = identifier or _multi_modal_content_identifier(url) @pydantic.computed_field @property def media_type(self) -> str: """Return the media type of the file, based on the URL or the provided `media_type`.""" return self._media_type or self._infer_media_type() @abstractmethod def _infer_media_type(self) -> str: """Infer the media type of the file based on the URL.""" raise NotImplementedError @property @abstractmethod def format(self) -> str: """The file format.""" raise NotImplementedError __repr__ = _utils.dataclasses_no_defaults_repr ``` #### url ```python url: str = url ``` The URL of the file. #### force_download ```python force_download: bool = force_download ``` If the model supports it: - If True, the file is downloaded and the data is sent to the model as bytes. - If False, the URL is sent directly to the model and no download is performed. #### vendor_metadata ```python vendor_metadata: dict[str, Any] | None = vendor_metadata ``` Vendor-specific metadata for the file. Supported by: - `GoogleModel`: `VideoUrl.vendor_metadata` is used as `video_metadata`: https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing #### identifier ```python identifier: str | None = ( identifier or _multi_modal_content_identifier(url) ) ``` The identifier of the file, such as a unique ID. generating one from the url if not explicitly set This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `FileUrl`. This identifier is only automatically passed to the model when the `FileUrl` is returned by a tool. If you're passing the `FileUrl` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file :" preceding the `FileUrl`. #### media_type ```python media_type: str ``` Return the media type of the file, based on the URL or the provided `media_type`. #### format ```python format: str ``` The file format. ### VideoUrl Bases: `FileUrl` A URL to a video. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(init=False, repr=False) class VideoUrl(FileUrl): """A URL to a video.""" url: str """The URL of the video.""" _: KW_ONLY kind: Literal['video-url'] = 'video-url' """Type identifier, this is available on all parts as a discriminator.""" def __init__( self, url: str, *, force_download: bool = False, vendor_metadata: dict[str, Any] | None = None, media_type: str | None = None, kind: Literal['video-url'] = 'video-url', identifier: str | None = None, # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. 
_media_type: str | None = None, ) -> None: super().__init__( url=url, force_download=force_download, vendor_metadata=vendor_metadata, media_type=media_type or _media_type, identifier=identifier, ) self.kind = kind def _infer_media_type(self) -> VideoMediaType: """Return the media type of the video, based on the url.""" if self.url.endswith('.mkv'): return 'video/x-matroska' elif self.url.endswith('.mov'): return 'video/quicktime' elif self.url.endswith('.mp4'): return 'video/mp4' elif self.url.endswith('.webm'): return 'video/webm' elif self.url.endswith('.flv'): return 'video/x-flv' elif self.url.endswith(('.mpeg', '.mpg')): return 'video/mpeg' elif self.url.endswith('.wmv'): return 'video/x-ms-wmv' elif self.url.endswith('.three_gp'): return 'video/3gpp' # Assume that YouTube videos are mp4 because there would be no extension # to infer from. This should not be a problem, as Gemini disregards media # type for YouTube URLs. elif self.is_youtube: return 'video/mp4' else: raise ValueError( f'Could not infer media type from video URL: {self.url}. Explicitly provide a `media_type` instead.' ) @property def is_youtube(self) -> bool: """True if the URL has a YouTube domain.""" return self.url.startswith(('https://youtu.be/', 'https://youtube.com/', 'https://www.youtube.com/')) @property def format(self) -> VideoFormat: """The file format of the video. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. """ return _video_format_lookup[self.media_type] ``` #### url ```python url: str ``` The URL of the video. #### kind ```python kind: Literal['video-url'] = kind ``` Type identifier, this is available on all parts as a discriminator. #### is_youtube ```python is_youtube: bool ``` True if the URL has a YouTube domain. #### format ```python format: VideoFormat ``` The file format of the video. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. ### AudioUrl Bases: `FileUrl` A URL to an audio file. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(init=False, repr=False) class AudioUrl(FileUrl): """A URL to an audio file.""" url: str """The URL of the audio file.""" _: KW_ONLY kind: Literal['audio-url'] = 'audio-url' """Type identifier, this is available on all parts as a discriminator.""" def __init__( self, url: str, *, force_download: bool = False, vendor_metadata: dict[str, Any] | None = None, media_type: str | None = None, kind: Literal['audio-url'] = 'audio-url', identifier: str | None = None, # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _media_type: str | None = None, ) -> None: super().__init__( url=url, force_download=force_download, vendor_metadata=vendor_metadata, media_type=media_type or _media_type, identifier=identifier, ) self.kind = kind def _infer_media_type(self) -> AudioMediaType: """Return the media type of the audio file, based on the url. References: - Gemini: https://ai.google.dev/gemini-api/docs/audio#supported-formats """ if self.url.endswith('.mp3'): return 'audio/mpeg' if self.url.endswith('.wav'): return 'audio/wav' if self.url.endswith('.flac'): return 'audio/flac' if self.url.endswith('.oga'): return 'audio/ogg' if self.url.endswith('.aiff'): return 'audio/aiff' if self.url.endswith('.aac'): return 'audio/aac' raise ValueError( f'Could not infer media type from audio URL: {self.url}. Explicitly provide a `media_type` instead.' 
) @property def format(self) -> AudioFormat: """The file format of the audio file.""" return _audio_format_lookup[self.media_type] ``` #### url ```python url: str ``` The URL of the audio file. #### kind ```python kind: Literal['audio-url'] = kind ``` Type identifier, this is available on all parts as a discriminator. #### format ```python format: AudioFormat ``` The file format of the audio file. ### ImageUrl Bases: `FileUrl` A URL to an image. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(init=False, repr=False) class ImageUrl(FileUrl): """A URL to an image.""" url: str """The URL of the image.""" _: KW_ONLY kind: Literal['image-url'] = 'image-url' """Type identifier, this is available on all parts as a discriminator.""" def __init__( self, url: str, *, force_download: bool = False, vendor_metadata: dict[str, Any] | None = None, media_type: str | None = None, kind: Literal['image-url'] = 'image-url', identifier: str | None = None, # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _media_type: str | None = None, ) -> None: super().__init__( url=url, force_download=force_download, vendor_metadata=vendor_metadata, media_type=media_type or _media_type, identifier=identifier, ) self.kind = kind def _infer_media_type(self) -> ImageMediaType: """Return the media type of the image, based on the url.""" if self.url.endswith(('.jpg', '.jpeg')): return 'image/jpeg' elif self.url.endswith('.png'): return 'image/png' elif self.url.endswith('.gif'): return 'image/gif' elif self.url.endswith('.webp'): return 'image/webp' else: raise ValueError( f'Could not infer media type from image URL: {self.url}. Explicitly provide a `media_type` instead.' ) @property def format(self) -> ImageFormat: """The file format of the image. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. """ return _image_format_lookup[self.media_type] ``` #### url ```python url: str ``` The URL of the image. #### kind ```python kind: Literal['image-url'] = kind ``` Type identifier, this is available on all parts as a discriminator. #### format ```python format: ImageFormat ``` The file format of the image. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. ### DocumentUrl Bases: `FileUrl` The URL of the document. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(init=False, repr=False) class DocumentUrl(FileUrl): """The URL of the document.""" url: str """The URL of the document.""" _: KW_ONLY kind: Literal['document-url'] = 'document-url' """Type identifier, this is available on all parts as a discriminator.""" def __init__( self, url: str, *, force_download: bool = False, vendor_metadata: dict[str, Any] | None = None, media_type: str | None = None, kind: Literal['document-url'] = 'document-url', identifier: str | None = None, # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _media_type: str | None = None, ) -> None: super().__init__( url=url, force_download=force_download, vendor_metadata=vendor_metadata, media_type=media_type or _media_type, identifier=identifier, ) self.kind = kind def _infer_media_type(self) -> str: """Return the media type of the document, based on the url.""" # Common document types are hardcoded here as mime-type support for these # extensions varies across operating systems. 
if self.url.endswith(('.md', '.mdx', '.markdown')): return 'text/markdown' elif self.url.endswith('.asciidoc'): return 'text/x-asciidoc' elif self.url.endswith('.txt'): return 'text/plain' elif self.url.endswith('.pdf'): return 'application/pdf' elif self.url.endswith('.rtf'): return 'application/rtf' elif self.url.endswith('.docx'): return 'application/vnd.openxmlformats-officedocument.wordprocessingml.document' elif self.url.endswith('.xlsx'): return 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' type_, _ = guess_type(self.url) if type_ is None: raise ValueError( f'Could not infer media type from document URL: {self.url}. Explicitly provide a `media_type` instead.' ) return type_ @property def format(self) -> DocumentFormat: """The file format of the document. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. """ media_type = self.media_type try: return _document_format_lookup[media_type] except KeyError as e: raise ValueError(f'Unknown document media type: {media_type}') from e ``` #### url ```python url: str ``` The URL of the document. #### kind ```python kind: Literal['document-url'] = kind ``` Type identifier, this is available on all parts as a discriminator. #### format ```python format: DocumentFormat ``` The file format of the document. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. ### BinaryContent Binary content, e.g. an audio or image file. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(init=False, repr=False) class BinaryContent: """Binary content, e.g. an audio or image file.""" data: bytes """The binary data.""" _: KW_ONLY media_type: AudioMediaType | ImageMediaType | DocumentMediaType | str """The media type of the binary data.""" identifier: str """Identifier for the binary content, such as a unique ID. generating one from the data if not explicitly set This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `BinaryContent`. This identifier is only automatically passed to the model when the `BinaryContent` is returned by a tool. If you're passing the `BinaryContent` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file :" preceding the `BinaryContent`. """ vendor_metadata: dict[str, Any] | None = None """Vendor-specific metadata for the file. 
Supported by: - `GoogleModel`: `BinaryContent.vendor_metadata` is used as `video_metadata`: https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing """ kind: Literal['binary'] = 'binary' """Type identifier, this is available on all parts as a discriminator.""" def __init__( self, data: bytes, *, media_type: AudioMediaType | ImageMediaType | DocumentMediaType | str, identifier: str | None = None, vendor_metadata: dict[str, Any] | None = None, kind: Literal['binary'] = 'binary', ) -> None: self.data = data self.media_type = media_type self.identifier = identifier or _multi_modal_content_identifier(data) self.vendor_metadata = vendor_metadata self.kind = kind @property def is_audio(self) -> bool: """Return `True` if the media type is an audio type.""" return self.media_type.startswith('audio/') @property def is_image(self) -> bool: """Return `True` if the media type is an image type.""" return self.media_type.startswith('image/') @property def is_video(self) -> bool: """Return `True` if the media type is a video type.""" return self.media_type.startswith('video/') @property def is_document(self) -> bool: """Return `True` if the media type is a document type.""" return self.media_type in _document_format_lookup @property def format(self) -> str: """The file format of the binary content.""" try: if self.is_audio: return _audio_format_lookup[self.media_type] elif self.is_image: return _image_format_lookup[self.media_type] elif self.is_video: return _video_format_lookup[self.media_type] else: return _document_format_lookup[self.media_type] except KeyError as e: raise ValueError(f'Unknown media type: {self.media_type}') from e __repr__ = _utils.dataclasses_no_defaults_repr ``` #### data ```python data: bytes = data ``` The binary data. #### media_type ```python media_type: ( AudioMediaType | ImageMediaType | DocumentMediaType | str ) = media_type ``` The media type of the binary data. #### identifier ```python identifier: str = ( identifier or _multi_modal_content_identifier(data) ) ``` Identifier for the binary content, such as a unique ID. generating one from the data if not explicitly set This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `BinaryContent`. This identifier is only automatically passed to the model when the `BinaryContent` is returned by a tool. If you're passing the `BinaryContent` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file :" preceding the `BinaryContent`. #### vendor_metadata ```python vendor_metadata: dict[str, Any] | None = vendor_metadata ``` Vendor-specific metadata for the file. Supported by: - `GoogleModel`: `BinaryContent.vendor_metadata` is used as `video_metadata`: https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing #### kind ```python kind: Literal['binary'] = kind ``` Type identifier, this is available on all parts as a discriminator. #### is_audio ```python is_audio: bool ``` Return `True` if the media type is an audio type. #### is_image ```python is_image: bool ``` Return `True` if the media type is an image type. #### is_video ```python is_video: bool ``` Return `True` if the media type is a video type. #### is_document ```python is_document: bool ``` Return `True` if the media type is a document type. 
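For illustration, a minimal sketch (using placeholder bytes rather than a real image) of how these convenience properties combine with `format`, which is documented next:

```python
from pydantic_ai.messages import BinaryContent

# Placeholder payload; in practice this would be real PNG bytes.
content = BinaryContent(data=b'\x89PNG...', media_type='image/png')

print(content.is_image)
#> True
print(content.format)
#> png
```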
#### format ```python format: str ``` The file format of the binary content. ### ToolReturn A structured return value for tools that need to provide both a return value and custom content to the model. This class allows tools to return complex responses that include: - A return value for actual tool return - Custom content (including multi-modal content) to be sent to the model as a UserPromptPart - Optional metadata for application use Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ToolReturn: """A structured return value for tools that need to provide both a return value and custom content to the model. This class allows tools to return complex responses that include: - A return value for actual tool return - Custom content (including multi-modal content) to be sent to the model as a UserPromptPart - Optional metadata for application use """ return_value: Any """The return value to be used in the tool response.""" _: KW_ONLY content: str | Sequence[UserContent] | None = None """The content to be sent to the model as a UserPromptPart.""" metadata: Any = None """Additional data that can be accessed programmatically by the application but is not sent to the LLM.""" kind: Literal['tool-return'] = 'tool-return' __repr__ = _utils.dataclasses_no_defaults_repr ``` #### return_value ```python return_value: Any ``` The return value to be used in the tool response. #### content ```python content: str | Sequence[UserContent] | None = None ``` The content to be sent to the model as a UserPromptPart. #### metadata ```python metadata: Any = None ``` Additional data that can be accessed programmatically by the application but is not sent to the LLM. ### UserPromptPart A user prompt, generally written by the end user. Content comes from the `user_prompt` parameter of Agent.run, Agent.run_sync, and Agent.run_stream. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class UserPromptPart: """A user prompt, generally written by the end user. Content comes from the `user_prompt` parameter of [`Agent.run`][pydantic_ai.agent.AbstractAgent.run], [`Agent.run_sync`][pydantic_ai.agent.AbstractAgent.run_sync], and [`Agent.run_stream`][pydantic_ai.agent.AbstractAgent.run_stream]. 
""" content: str | Sequence[UserContent] """The content of the prompt.""" _: KW_ONLY timestamp: datetime = field(default_factory=_now_utc) """The timestamp of the prompt.""" part_kind: Literal['user-prompt'] = 'user-prompt' """Part type identifier, this is available on all parts as a discriminator.""" def otel_event(self, settings: InstrumentationSettings) -> Event: content = [{'kind': part.pop('type'), **part} for part in self.otel_message_parts(settings)] for part in content: if part['kind'] == 'binary' and 'content' in part: part['binary_content'] = part.pop('content') content = [ part['content'] if part == {'kind': 'text', 'content': part.get('content')} else part for part in content ] if content in ([{'kind': 'text'}], [self.content]): content = content[0] return Event('gen_ai.user.message', body={'content': content, 'role': 'user'}) def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: parts: list[_otel_messages.MessagePart] = [] content: Sequence[UserContent] = [self.content] if isinstance(self.content, str) else self.content for part in content: if isinstance(part, str): parts.append( _otel_messages.TextPart(type='text', **({'content': part} if settings.include_content else {})) ) elif isinstance(part, ImageUrl | AudioUrl | DocumentUrl | VideoUrl): parts.append( _otel_messages.MediaUrlPart( type=part.kind, **{'url': part.url} if settings.include_content else {}, ) ) elif isinstance(part, BinaryContent): converted_part = _otel_messages.BinaryDataPart(type='binary', media_type=part.media_type) if settings.include_content and settings.include_binary_content: converted_part['content'] = base64.b64encode(part.data).decode() parts.append(converted_part) else: parts.append({'type': part.kind}) # pragma: no cover return parts __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: str | Sequence[UserContent] ``` The content of the prompt. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp of the prompt. #### part_kind ```python part_kind: Literal['user-prompt'] = 'user-prompt' ``` Part type identifier, this is available on all parts as a discriminator. ### BaseToolReturnPart Base class for tool return parts. 
Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BaseToolReturnPart: """Base class for tool return parts.""" tool_name: str """The name of the "tool" was called.""" content: Any """The return value.""" tool_call_id: str """The tool call identifier, this is used by some models including OpenAI.""" _: KW_ONLY metadata: Any = None """Additional data that can be accessed programmatically by the application but is not sent to the LLM.""" timestamp: datetime = field(default_factory=_now_utc) """The timestamp, when the tool returned.""" def model_response_str(self) -> str: """Return a string representation of the content for the model.""" if isinstance(self.content, str): return self.content else: return tool_return_ta.dump_json(self.content).decode() def model_response_object(self) -> dict[str, Any]: """Return a dictionary representation of the content, wrapping non-dict types appropriately.""" # gemini supports JSON dict return values, but no other JSON types, hence we wrap anything else in a dict if isinstance(self.content, dict): return tool_return_ta.dump_python(self.content, mode='json') # pyright: ignore[reportUnknownMemberType] else: return {'return_value': tool_return_ta.dump_python(self.content, mode='json')} def otel_event(self, settings: InstrumentationSettings) -> Event: return Event( 'gen_ai.tool.message', body={ **({'content': self.content} if settings.include_content else {}), 'role': 'tool', 'id': self.tool_call_id, 'name': self.tool_name, }, ) def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: from .models.instrumented import InstrumentedModel return [ _otel_messages.ToolCallResponsePart( type='tool_call_response', id=self.tool_call_id, name=self.tool_name, **({'result': InstrumentedModel.serialize_any(self.content)} if settings.include_content else {}), ) ] def has_content(self) -> bool: """Return `True` if the tool return has content.""" return self.content is not None # pragma: no cover __repr__ = _utils.dataclasses_no_defaults_repr ``` #### tool_name ```python tool_name: str ``` The name of the "tool" was called. #### content ```python content: Any ``` The return value. #### tool_call_id ```python tool_call_id: str ``` The tool call identifier, this is used by some models including OpenAI. #### metadata ```python metadata: Any = None ``` Additional data that can be accessed programmatically by the application but is not sent to the LLM. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp, when the tool returned. #### model_response_str ```python model_response_str() -> str ``` Return a string representation of the content for the model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def model_response_str(self) -> str: """Return a string representation of the content for the model.""" if isinstance(self.content, str): return self.content else: return tool_return_ta.dump_json(self.content).decode() ``` #### model_response_object ```python model_response_object() -> dict[str, Any] ``` Return a dictionary representation of the content, wrapping non-dict types appropriately. 
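As a hedged sketch, the two helpers behave roughly like this on a `ToolReturnPart`, the concrete subclass documented further down; the tool name and call ID here are made up for illustration:

```python
from pydantic_ai.messages import ToolReturnPart

part = ToolReturnPart(
    tool_name='roulette_wheel',  # hypothetical tool name
    content={'winner': True},
    tool_call_id='call_1',  # hypothetical call ID
)
print(part.model_response_object())
#> {'winner': True}

part = ToolReturnPart(tool_name='roulette_wheel', content='winner', tool_call_id='call_1')
print(part.model_response_str())
#> winner
```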
Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def model_response_object(self) -> dict[str, Any]: """Return a dictionary representation of the content, wrapping non-dict types appropriately.""" # gemini supports JSON dict return values, but no other JSON types, hence we wrap anything else in a dict if isinstance(self.content, dict): return tool_return_ta.dump_python(self.content, mode='json') # pyright: ignore[reportUnknownMemberType] else: return {'return_value': tool_return_ta.dump_python(self.content, mode='json')} ``` #### has_content ```python has_content() -> bool ``` Return `True` if the tool return has content. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the tool return has content.""" return self.content is not None # pragma: no cover ``` ### ToolReturnPart Bases: `BaseToolReturnPart` A tool return message, this encodes the result of running a tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ToolReturnPart(BaseToolReturnPart): """A tool return message, this encodes the result of running a tool.""" _: KW_ONLY part_kind: Literal['tool-return'] = 'tool-return' """Part type identifier, this is available on all parts as a discriminator.""" ``` #### part_kind ```python part_kind: Literal['tool-return'] = 'tool-return' ``` Part type identifier, this is available on all parts as a discriminator. ### BuiltinToolReturnPart Bases: `BaseToolReturnPart` A tool return message from a built-in tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BuiltinToolReturnPart(BaseToolReturnPart): """A tool return message from a built-in tool.""" _: KW_ONLY provider_name: str | None = None """The name of the provider that generated the response.""" part_kind: Literal['builtin-tool-return'] = 'builtin-tool-return' """Part type identifier, this is available on all parts as a discriminator.""" ``` #### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. #### part_kind ```python part_kind: Literal["builtin-tool-return"] = ( "builtin-tool-return" ) ``` Part type identifier, this is available on all parts as a discriminator. ### RetryPromptPart A message back to a model asking it to try again. This can be sent for a number of reasons: - Pydantic validation of tool arguments failed, here content is derived from a Pydantic ValidationError - a tool raised a ModelRetry exception - no tool was found for the tool name - the model returned plain text when a structured response was expected - Pydantic validation of a structured response failed, here content is derived from a Pydantic ValidationError - an output validator raised a ModelRetry exception Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class RetryPromptPart: """A message back to a model asking it to try again. 
This can be sent for a number of reasons: * Pydantic validation of tool arguments failed, here content is derived from a Pydantic [`ValidationError`][pydantic_core.ValidationError] * a tool raised a [`ModelRetry`][pydantic_ai.exceptions.ModelRetry] exception * no tool was found for the tool name * the model returned plain text when a structured response was expected * Pydantic validation of a structured response failed, here content is derived from a Pydantic [`ValidationError`][pydantic_core.ValidationError] * an output validator raised a [`ModelRetry`][pydantic_ai.exceptions.ModelRetry] exception """ content: list[pydantic_core.ErrorDetails] | str """Details of why and how the model should retry. If the retry was triggered by a [`ValidationError`][pydantic_core.ValidationError], this will be a list of error details. """ _: KW_ONLY tool_name: str | None = None """The name of the tool that was called, if any.""" tool_call_id: str = field(default_factory=_generate_tool_call_id) """The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. """ timestamp: datetime = field(default_factory=_now_utc) """The timestamp, when the retry was triggered.""" part_kind: Literal['retry-prompt'] = 'retry-prompt' """Part type identifier, this is available on all parts as a discriminator.""" def model_response(self) -> str: """Return a string message describing why the retry is requested.""" if isinstance(self.content, str): if self.tool_name is None: description = f'Validation feedback:\n{self.content}' else: description = self.content else: json_errors = error_details_ta.dump_json(self.content, exclude={'__all__': {'ctx'}}, indent=2) description = f'{len(self.content)} validation errors: {json_errors.decode()}' return f'{description}\n\nFix the errors and try again.' def otel_event(self, settings: InstrumentationSettings) -> Event: if self.tool_name is None: return Event('gen_ai.user.message', body={'content': self.model_response(), 'role': 'user'}) else: return Event( 'gen_ai.tool.message', body={ **({'content': self.model_response()} if settings.include_content else {}), 'role': 'tool', 'id': self.tool_call_id, 'name': self.tool_name, }, ) def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: if self.tool_name is None: return [_otel_messages.TextPart(type='text', content=self.model_response())] else: return [ _otel_messages.ToolCallResponsePart( type='tool_call_response', id=self.tool_call_id, name=self.tool_name, **({'result': self.model_response()} if settings.include_content else {}), ) ] __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: list[ErrorDetails] | str ``` Details of why and how the model should retry. If the retry was triggered by a ValidationError, this will be a list of error details. #### tool_name ```python tool_name: str | None = None ``` The name of the tool that was called, if any. #### tool_call_id ```python tool_call_id: str = field( default_factory=generate_tool_call_id ) ``` The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp, when the retry was triggered. 
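A minimal sketch of how the retry feedback is rendered for the model via `model_response()` (documented below); the tool name is made up:

```python
from pydantic_ai.messages import RetryPromptPart

retry = RetryPromptPart(
    content='Temperature must be provided in Celsius.',
    tool_name='weather_forecast',  # hypothetical tool name
)
print(retry.model_response())
"""
Temperature must be provided in Celsius.

Fix the errors and try again.
"""
```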
#### part_kind ```python part_kind: Literal['retry-prompt'] = 'retry-prompt' ``` Part type identifier, this is available on all parts as a discriminator. #### model_response ```python model_response() -> str ``` Return a string message describing why the retry is requested. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def model_response(self) -> str: """Return a string message describing why the retry is requested.""" if isinstance(self.content, str): if self.tool_name is None: description = f'Validation feedback:\n{self.content}' else: description = self.content else: json_errors = error_details_ta.dump_json(self.content, exclude={'__all__': {'ctx'}}, indent=2) description = f'{len(self.content)} validation errors: {json_errors.decode()}' return f'{description}\n\nFix the errors and try again.' ``` ### ModelRequestPart ```python ModelRequestPart = Annotated[ SystemPromptPart | UserPromptPart | ToolReturnPart | RetryPromptPart, Discriminator("part_kind"), ] ``` A message part sent by Pydantic AI to a model. ### ModelRequest A request generated by Pydantic AI and sent to a model, e.g. a message from the Pydantic AI app to the model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ModelRequest: """A request generated by Pydantic AI and sent to a model, e.g. a message from the Pydantic AI app to the model.""" parts: Sequence[ModelRequestPart] """The parts of the user message.""" _: KW_ONLY instructions: str | None = None """The instructions for the model.""" kind: Literal['request'] = 'request' """Message type identifier, this is available on all parts as a discriminator.""" @classmethod def user_text_prompt(cls, user_prompt: str, *, instructions: str | None = None) -> ModelRequest: """Create a `ModelRequest` with a single user prompt as text.""" return cls(parts=[UserPromptPart(user_prompt)], instructions=instructions) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### parts ```python parts: Sequence[ModelRequestPart] ``` The parts of the user message. #### instructions ```python instructions: str | None = None ``` The instructions for the model. #### kind ```python kind: Literal['request'] = 'request' ``` Message type identifier, this is available on all parts as a discriminator. #### user_text_prompt ```python user_text_prompt( user_prompt: str, *, instructions: str | None = None ) -> ModelRequest ``` Create a `ModelRequest` with a single user prompt as text. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @classmethod def user_text_prompt(cls, user_prompt: str, *, instructions: str | None = None) -> ModelRequest: """Create a `ModelRequest` with a single user prompt as text.""" return cls(parts=[UserPromptPart(user_prompt)], instructions=instructions) ``` ### TextPart A plain text response from a model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class TextPart: """A plain text response from a model.""" content: str """The text content of the response.""" _: KW_ONLY id: str | None = None """An optional identifier of the text part.""" part_kind: Literal['text'] = 'text' """Part type identifier, this is available on all parts as a discriminator.""" def has_content(self) -> bool: """Return `True` if the text content is non-empty.""" return bool(self.content) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: str ``` The text content of the response. 
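For example, `has_content()` (documented below) treats empty text as having no content:

```python
from pydantic_ai.messages import TextPart

print(TextPart(content='Hello world').has_content())
#> True
print(TextPart(content='').has_content())
#> False
```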
#### id ```python id: str | None = None ``` An optional identifier of the text part. #### part_kind ```python part_kind: Literal['text'] = 'text' ``` Part type identifier, this is available on all parts as a discriminator. #### has_content ```python has_content() -> bool ``` Return `True` if the text content is non-empty. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the text content is non-empty.""" return bool(self.content) ``` ### ThinkingPart A thinking response from a model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ThinkingPart: """A thinking response from a model.""" content: str """The thinking content of the response.""" _: KW_ONLY id: str | None = None """The identifier of the thinking part.""" signature: str | None = None """The signature of the thinking. Supported by: * Anthropic (corresponds to the `signature` field) * Bedrock (corresponds to the `signature` field) * Google (corresponds to the `thought_signature` field) * OpenAI (corresponds to the `encrypted_content` field) """ provider_name: str | None = None """The name of the provider that generated the response. Signatures are only sent back to the same provider. """ part_kind: Literal['thinking'] = 'thinking' """Part type identifier, this is available on all parts as a discriminator.""" def has_content(self) -> bool: """Return `True` if the thinking content is non-empty.""" return bool(self.content) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: str ``` The thinking content of the response. #### id ```python id: str | None = None ``` The identifier of the thinking part. #### signature ```python signature: str | None = None ``` The signature of the thinking. Supported by: - Anthropic (corresponds to the `signature` field) - Bedrock (corresponds to the `signature` field) - Google (corresponds to the `thought_signature` field) - OpenAI (corresponds to the `encrypted_content` field) #### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. Signatures are only sent back to the same provider. #### part_kind ```python part_kind: Literal['thinking'] = 'thinking' ``` Part type identifier, this is available on all parts as a discriminator. #### has_content ```python has_content() -> bool ``` Return `True` if the thinking content is non-empty. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the thinking content is non-empty.""" return bool(self.content) ``` ### BaseToolCallPart A tool call from a model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BaseToolCallPart: """A tool call from a model.""" tool_name: str """The name of the tool to call.""" args: str | dict[str, Any] | None = None """The arguments to pass to the tool. This is stored either as a JSON string or a Python dictionary depending on how data was received. """ tool_call_id: str = field(default_factory=_generate_tool_call_id) """The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. """ def args_as_dict(self) -> dict[str, Any]: """Return the arguments as a Python dictionary. This is just for convenience with models that require dicts as input. 
""" if not self.args: return {} if isinstance(self.args, dict): return self.args args = pydantic_core.from_json(self.args) assert isinstance(args, dict), 'args should be a dict' return cast(dict[str, Any], args) def args_as_json_str(self) -> str: """Return the arguments as a JSON string. This is just for convenience with models that require JSON strings as input. """ if not self.args: return '{}' if isinstance(self.args, str): return self.args return pydantic_core.to_json(self.args).decode() def has_content(self) -> bool: """Return `True` if the arguments contain any data.""" if isinstance(self.args, dict): # TODO: This should probably return True if you have the value False, or 0, etc. # It makes sense to me to ignore empty strings, but not sure about empty lists or dicts return any(self.args.values()) else: return bool(self.args) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### tool_name ```python tool_name: str ``` The name of the tool to call. #### args ```python args: str | dict[str, Any] | None = None ``` The arguments to pass to the tool. This is stored either as a JSON string or a Python dictionary depending on how data was received. #### tool_call_id ```python tool_call_id: str = field( default_factory=generate_tool_call_id ) ``` The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. #### args_as_dict ```python args_as_dict() -> dict[str, Any] ``` Return the arguments as a Python dictionary. This is just for convenience with models that require dicts as input. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def args_as_dict(self) -> dict[str, Any]: """Return the arguments as a Python dictionary. This is just for convenience with models that require dicts as input. """ if not self.args: return {} if isinstance(self.args, dict): return self.args args = pydantic_core.from_json(self.args) assert isinstance(args, dict), 'args should be a dict' return cast(dict[str, Any], args) ``` #### args_as_json_str ```python args_as_json_str() -> str ``` Return the arguments as a JSON string. This is just for convenience with models that require JSON strings as input. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def args_as_json_str(self) -> str: """Return the arguments as a JSON string. This is just for convenience with models that require JSON strings as input. """ if not self.args: return '{}' if isinstance(self.args, str): return self.args return pydantic_core.to_json(self.args).decode() ``` #### has_content ```python has_content() -> bool ``` Return `True` if the arguments contain any data. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the arguments contain any data.""" if isinstance(self.args, dict): # TODO: This should probably return True if you have the value False, or 0, etc. # It makes sense to me to ignore empty strings, but not sure about empty lists or dicts return any(self.args.values()) else: return bool(self.args) ``` ### ToolCallPart Bases: `BaseToolCallPart` A tool call from a model. 
Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ToolCallPart(BaseToolCallPart): """A tool call from a model.""" _: KW_ONLY part_kind: Literal['tool-call'] = 'tool-call' """Part type identifier, this is available on all parts as a discriminator.""" ``` #### part_kind ```python part_kind: Literal['tool-call'] = 'tool-call' ``` Part type identifier, this is available on all parts as a discriminator. ### BuiltinToolCallPart Bases: `BaseToolCallPart` A tool call to a built-in tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BuiltinToolCallPart(BaseToolCallPart): """A tool call to a built-in tool.""" _: KW_ONLY provider_name: str | None = None """The name of the provider that generated the response. Built-in tool calls are only sent back to the same provider. """ part_kind: Literal['builtin-tool-call'] = 'builtin-tool-call' """Part type identifier, this is available on all parts as a discriminator.""" ``` #### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. Built-in tool calls are only sent back to the same provider. #### part_kind ```python part_kind: Literal["builtin-tool-call"] = ( "builtin-tool-call" ) ``` Part type identifier, this is available on all parts as a discriminator. ### ModelResponsePart ```python ModelResponsePart = Annotated[ TextPart | ToolCallPart | BuiltinToolCallPart | BuiltinToolReturnPart | ThinkingPart, Discriminator("part_kind"), ] ``` A message part returned by a model. ### ModelResponse A response from a model, e.g. a message from the model to the Pydantic AI app. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ModelResponse: """A response from a model, e.g. a message from the model to the Pydantic AI app.""" parts: Sequence[ModelResponsePart] """The parts of the model message.""" _: KW_ONLY usage: RequestUsage = field(default_factory=RequestUsage) """Usage information for the request. This has a default to make tests easier, and to support loading old messages where usage will be missing. """ model_name: str | None = None """The name of the model that generated the response.""" timestamp: datetime = field(default_factory=_now_utc) """The timestamp of the response. If the model provides a timestamp in the response (as OpenAI does) that will be used. """ kind: Literal['response'] = 'response' """Message type identifier, this is available on all parts as a discriminator.""" provider_name: str | None = None """The name of the LLM provider that generated the response.""" provider_details: Annotated[ dict[str, Any] | None, # `vendor_details` is deprecated, but we still want to support deserializing model responses stored in a DB before the name was changed pydantic.Field(validation_alias=pydantic.AliasChoices('provider_details', 'vendor_details')), ] = None """Additional provider-specific details in a serializable format. This allows storing selected vendor-specific data that isn't mapped to standard ModelResponse fields. For OpenAI models, this may include 'logprobs', 'finish_reason', etc. """ provider_response_id: Annotated[ str | None, # `vendor_id` is deprecated, but we still want to support deserializing model responses stored in a DB before the name was changed pydantic.Field(validation_alias=pydantic.AliasChoices('provider_response_id', 'vendor_id')), ] = None """request ID as specified by the model provider. 
This can be used to track the specific request to the model.""" finish_reason: FinishReason | None = None """Reason the model finished generating the response, normalized to OpenTelemetry values.""" @deprecated('`price` is deprecated, use `cost` instead') def price(self) -> genai_types.PriceCalculation: # pragma: no cover return self.cost() def cost(self) -> genai_types.PriceCalculation: """Calculate the cost of the usage. Uses [`genai-prices`](https://github.com/pydantic/genai-prices). """ assert self.model_name, 'Model name is required to calculate price' return calc_price( self.usage, self.model_name, provider_id=self.provider_name, genai_request_timestamp=self.timestamp, ) def otel_events(self, settings: InstrumentationSettings) -> list[Event]: """Return OpenTelemetry events for the response.""" result: list[Event] = [] def new_event_body(): new_body: dict[str, Any] = {'role': 'assistant'} ev = Event('gen_ai.assistant.message', body=new_body) result.append(ev) return new_body body = new_event_body() for part in self.parts: if isinstance(part, ToolCallPart): body.setdefault('tool_calls', []).append( { 'id': part.tool_call_id, 'type': 'function', 'function': { 'name': part.tool_name, **({'arguments': part.args} if settings.include_content else {}), }, } ) elif isinstance(part, TextPart | ThinkingPart): kind = part.part_kind body.setdefault('content', []).append( {'kind': kind, **({'text': part.content} if settings.include_content else {})} ) if content := body.get('content'): text_content = content[0].get('text') if content == [{'kind': 'text', 'text': text_content}]: body['content'] = text_content return result def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: parts: list[_otel_messages.MessagePart] = [] for part in self.parts: if isinstance(part, TextPart): parts.append( _otel_messages.TextPart( type='text', **({'content': part.content} if settings.include_content else {}), ) ) elif isinstance(part, ThinkingPart): parts.append( _otel_messages.ThinkingPart( type='thinking', **({'content': part.content} if settings.include_content else {}), ) ) elif isinstance(part, ToolCallPart): call_part = _otel_messages.ToolCallPart(type='tool_call', id=part.tool_call_id, name=part.tool_name) if settings.include_content and part.args is not None: from .models.instrumented import InstrumentedModel if isinstance(part.args, str): call_part['arguments'] = part.args else: call_part['arguments'] = {k: InstrumentedModel.serialize_any(v) for k, v in part.args.items()} parts.append(call_part) return parts @property @deprecated('`vendor_details` is deprecated, use `provider_details` instead') def vendor_details(self) -> dict[str, Any] | None: return self.provider_details @property @deprecated('`vendor_id` is deprecated, use `provider_response_id` instead') def vendor_id(self) -> str | None: return self.provider_response_id @property @deprecated('`provider_request_id` is deprecated, use `provider_response_id` instead') def provider_request_id(self) -> str | None: return self.provider_response_id __repr__ = _utils.dataclasses_no_defaults_repr ``` #### parts ```python parts: Sequence[ModelResponsePart] ``` The parts of the model message. #### usage ```python usage: RequestUsage = field(default_factory=RequestUsage) ``` Usage information for the request. This has a default to make tests easier, and to support loading old messages where usage will be missing. 
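A minimal sketch of constructing a `ModelResponse` by hand (useful in tests), relying on the defaults described in the fields below; the model name and tool call are arbitrary:

```python
from pydantic_ai.messages import ModelResponse, TextPart, ToolCallPart

response = ModelResponse(
    parts=[
        TextPart(content='Checking the forecast...'),
        ToolCallPart(tool_name='weather_forecast', args='{"location": "London"}'),
    ],
    model_name='gpt-4o',
)
print(response.kind)
#> response
print(response.finish_reason)
#> None
```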
#### model_name ```python model_name: str | None = None ``` The name of the model that generated the response. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp of the response. If the model provides a timestamp in the response (as OpenAI does) that will be used. #### kind ```python kind: Literal['response'] = 'response' ``` Message type identifier, this is available on all parts as a discriminator. #### provider_name ```python provider_name: str | None = None ``` The name of the LLM provider that generated the response. #### provider_details ```python provider_details: Annotated[ dict[str, Any] | None, Field( validation_alias=AliasChoices( provider_details, vendor_details ) ), ] = None ``` Additional provider-specific details in a serializable format. This allows storing selected vendor-specific data that isn't mapped to standard ModelResponse fields. For OpenAI models, this may include 'logprobs', 'finish_reason', etc. #### provider_response_id ```python provider_response_id: Annotated[ str | None, Field( validation_alias=AliasChoices( provider_response_id, vendor_id ) ), ] = None ``` request ID as specified by the model provider. This can be used to track the specific request to the model. #### finish_reason ```python finish_reason: FinishReason | None = None ``` Reason the model finished generating the response, normalized to OpenTelemetry values. #### price ```python price() -> PriceCalculation ``` Deprecated `price` is deprecated, use `cost` instead Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @deprecated('`price` is deprecated, use `cost` instead') def price(self) -> genai_types.PriceCalculation: # pragma: no cover return self.cost() ``` #### cost ```python cost() -> PriceCalculation ``` Calculate the cost of the usage. Uses [`genai-prices`](https://github.com/pydantic/genai-prices). Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def cost(self) -> genai_types.PriceCalculation: """Calculate the cost of the usage. Uses [`genai-prices`](https://github.com/pydantic/genai-prices). """ assert self.model_name, 'Model name is required to calculate price' return calc_price( self.usage, self.model_name, provider_id=self.provider_name, genai_request_timestamp=self.timestamp, ) ``` #### otel_events ```python otel_events( settings: InstrumentationSettings, ) -> list[Event] ``` Return OpenTelemetry events for the response. 
Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def otel_events(self, settings: InstrumentationSettings) -> list[Event]: """Return OpenTelemetry events for the response.""" result: list[Event] = [] def new_event_body(): new_body: dict[str, Any] = {'role': 'assistant'} ev = Event('gen_ai.assistant.message', body=new_body) result.append(ev) return new_body body = new_event_body() for part in self.parts: if isinstance(part, ToolCallPart): body.setdefault('tool_calls', []).append( { 'id': part.tool_call_id, 'type': 'function', 'function': { 'name': part.tool_name, **({'arguments': part.args} if settings.include_content else {}), }, } ) elif isinstance(part, TextPart | ThinkingPart): kind = part.part_kind body.setdefault('content', []).append( {'kind': kind, **({'text': part.content} if settings.include_content else {})} ) if content := body.get('content'): text_content = content[0].get('text') if content == [{'kind': 'text', 'text': text_content}]: body['content'] = text_content return result ``` ### ModelMessage ```python ModelMessage = Annotated[ ModelRequest | ModelResponse, Discriminator("kind") ] ``` Any message sent to or returned by a model. ### ModelMessagesTypeAdapter ```python ModelMessagesTypeAdapter = TypeAdapter( list[ModelMessage], config=ConfigDict( defer_build=True, ser_json_bytes="base64", val_json_bytes="base64", ), ) ``` Pydantic TypeAdapter for (de)serializing messages. ### TextPartDelta A partial update (delta) for a `TextPart` to append new text content. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class TextPartDelta: """A partial update (delta) for a `TextPart` to append new text content.""" content_delta: str """The incremental text content to add to the existing `TextPart` content.""" _: KW_ONLY part_delta_kind: Literal['text'] = 'text' """Part delta type identifier, used as a discriminator.""" def apply(self, part: ModelResponsePart) -> TextPart: """Apply this text delta to an existing `TextPart`. Args: part: The existing model response part, which must be a `TextPart`. Returns: A new `TextPart` with updated text content. Raises: ValueError: If `part` is not a `TextPart`. """ if not isinstance(part, TextPart): raise ValueError('Cannot apply TextPartDeltas to non-TextParts') # pragma: no cover return replace(part, content=part.content + self.content_delta) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content_delta ```python content_delta: str ``` The incremental text content to add to the existing `TextPart` content. #### part_delta_kind ```python part_delta_kind: Literal['text'] = 'text' ``` Part delta type identifier, used as a discriminator. #### apply ```python apply(part: ModelResponsePart) -> TextPart ``` Apply this text delta to an existing `TextPart`. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `part` | `ModelResponsePart` | The existing model response part, which must be a TextPart. | *required* | Returns: | Type | Description | | --- | --- | | `TextPart` | A new TextPart with updated text content. | Raises: | Type | Description | | --- | --- | | `ValueError` | If part is not a TextPart. | Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def apply(self, part: ModelResponsePart) -> TextPart: """Apply this text delta to an existing `TextPart`. Args: part: The existing model response part, which must be a `TextPart`. Returns: A new `TextPart` with updated text content. Raises: ValueError: If `part` is not a `TextPart`. 
""" if not isinstance(part, TextPart): raise ValueError('Cannot apply TextPartDeltas to non-TextParts') # pragma: no cover return replace(part, content=part.content + self.content_delta) ``` ### ThinkingPartDelta A partial update (delta) for a `ThinkingPart` to append new thinking content. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class ThinkingPartDelta: """A partial update (delta) for a `ThinkingPart` to append new thinking content.""" content_delta: str | None = None """The incremental thinking content to add to the existing `ThinkingPart` content.""" signature_delta: str | None = None """Optional signature delta. Note this is never treated as a delta — it can replace None. """ provider_name: str | None = None """Optional provider name for the thinking part. Signatures are only sent back to the same provider. """ part_delta_kind: Literal['thinking'] = 'thinking' """Part delta type identifier, used as a discriminator.""" @overload def apply(self, part: ModelResponsePart) -> ThinkingPart: ... @overload def apply(self, part: ModelResponsePart | ThinkingPartDelta) -> ThinkingPart | ThinkingPartDelta: ... def apply(self, part: ModelResponsePart | ThinkingPartDelta) -> ThinkingPart | ThinkingPartDelta: """Apply this thinking delta to an existing `ThinkingPart`. Args: part: The existing model response part, which must be a `ThinkingPart`. Returns: A new `ThinkingPart` with updated thinking content. Raises: ValueError: If `part` is not a `ThinkingPart`. """ if isinstance(part, ThinkingPart): new_content = part.content + self.content_delta if self.content_delta else part.content new_signature = self.signature_delta if self.signature_delta is not None else part.signature new_provider_name = self.provider_name if self.provider_name is not None else part.provider_name return replace(part, content=new_content, signature=new_signature, provider_name=new_provider_name) elif isinstance(part, ThinkingPartDelta): if self.content_delta is None and self.signature_delta is None: raise ValueError('Cannot apply ThinkingPartDelta with no content or signature') if self.content_delta is not None: part = replace(part, content_delta=(part.content_delta or '') + self.content_delta) if self.signature_delta is not None: part = replace(part, signature_delta=self.signature_delta) if self.provider_name is not None: part = replace(part, provider_name=self.provider_name) return part raise ValueError( # pragma: no cover f'Cannot apply ThinkingPartDeltas to non-ThinkingParts or non-ThinkingPartDeltas ({part=}, {self=})' ) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content_delta ```python content_delta: str | None = None ``` The incremental thinking content to add to the existing `ThinkingPart` content. #### signature_delta ```python signature_delta: str | None = None ``` Optional signature delta. Note this is never treated as a delta — it can replace None. #### provider_name ```python provider_name: str | None = None ``` Optional provider name for the thinking part. Signatures are only sent back to the same provider. #### part_delta_kind ```python part_delta_kind: Literal['thinking'] = 'thinking' ``` Part delta type identifier, used as a discriminator. 
#### apply ```python apply(part: ModelResponsePart) -> ThinkingPart ``` ```python apply( part: ModelResponsePart | ThinkingPartDelta, ) -> ThinkingPart | ThinkingPartDelta ``` ```python apply( part: ModelResponsePart | ThinkingPartDelta, ) -> ThinkingPart | ThinkingPartDelta ``` Apply this thinking delta to an existing `ThinkingPart`. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `part` | `ModelResponsePart | ThinkingPartDelta` | The existing model response part, which must be a ThinkingPart. | *required* | Returns: | Type | Description | | --- | --- | | `ThinkingPart | ThinkingPartDelta` | A new ThinkingPart with updated thinking content. | Raises: | Type | Description | | --- | --- | | `ValueError` | If part is not a ThinkingPart. | Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def apply(self, part: ModelResponsePart | ThinkingPartDelta) -> ThinkingPart | ThinkingPartDelta: """Apply this thinking delta to an existing `ThinkingPart`. Args: part: The existing model response part, which must be a `ThinkingPart`. Returns: A new `ThinkingPart` with updated thinking content. Raises: ValueError: If `part` is not a `ThinkingPart`. """ if isinstance(part, ThinkingPart): new_content = part.content + self.content_delta if self.content_delta else part.content new_signature = self.signature_delta if self.signature_delta is not None else part.signature new_provider_name = self.provider_name if self.provider_name is not None else part.provider_name return replace(part, content=new_content, signature=new_signature, provider_name=new_provider_name) elif isinstance(part, ThinkingPartDelta): if self.content_delta is None and self.signature_delta is None: raise ValueError('Cannot apply ThinkingPartDelta with no content or signature') if self.content_delta is not None: part = replace(part, content_delta=(part.content_delta or '') + self.content_delta) if self.signature_delta is not None: part = replace(part, signature_delta=self.signature_delta) if self.provider_name is not None: part = replace(part, provider_name=self.provider_name) return part raise ValueError( # pragma: no cover f'Cannot apply ThinkingPartDeltas to non-ThinkingParts or non-ThinkingPartDeltas ({part=}, {self=})' ) ``` ### ToolCallPartDelta A partial update (delta) for a `ToolCallPart` to modify tool name, arguments, or tool call ID. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class ToolCallPartDelta: """A partial update (delta) for a `ToolCallPart` to modify tool name, arguments, or tool call ID.""" tool_name_delta: str | None = None """Incremental text to add to the existing tool name, if any.""" args_delta: str | dict[str, Any] | None = None """Incremental data to add to the tool arguments. If this is a string, it will be appended to existing JSON arguments. If this is a dict, it will be merged with existing dict arguments. """ tool_call_id: str | None = None """Optional tool call identifier, this is used by some models including OpenAI. Note this is never treated as a delta — it can replace None, but otherwise if a non-matching value is provided an error will be raised.""" part_delta_kind: Literal['tool_call'] = 'tool_call' """Part delta type identifier, used as a discriminator.""" def as_part(self) -> ToolCallPart | None: """Convert this delta to a fully formed `ToolCallPart` if possible, otherwise return `None`. Returns: A `ToolCallPart` if `tool_name_delta` is set, otherwise `None`. 
""" if self.tool_name_delta is None: return None return ToolCallPart(self.tool_name_delta, self.args_delta, self.tool_call_id or _generate_tool_call_id()) @overload def apply(self, part: ModelResponsePart) -> ToolCallPart: ... @overload def apply(self, part: ModelResponsePart | ToolCallPartDelta) -> ToolCallPart | ToolCallPartDelta: ... def apply(self, part: ModelResponsePart | ToolCallPartDelta) -> ToolCallPart | ToolCallPartDelta: """Apply this delta to a part or delta, returning a new part or delta with the changes applied. Args: part: The existing model response part or delta to update. Returns: Either a new `ToolCallPart` or an updated `ToolCallPartDelta`. Raises: ValueError: If `part` is neither a `ToolCallPart` nor a `ToolCallPartDelta`. UnexpectedModelBehavior: If applying JSON deltas to dict arguments or vice versa. """ if isinstance(part, ToolCallPart): return self._apply_to_part(part) if isinstance(part, ToolCallPartDelta): return self._apply_to_delta(part) raise ValueError( # pragma: no cover f'Can only apply ToolCallPartDeltas to ToolCallParts or ToolCallPartDeltas, not {part}' ) def _apply_to_delta(self, delta: ToolCallPartDelta) -> ToolCallPart | ToolCallPartDelta: """Internal helper to apply this delta to another delta.""" if self.tool_name_delta: # Append incremental text to the existing tool_name_delta updated_tool_name_delta = (delta.tool_name_delta or '') + self.tool_name_delta delta = replace(delta, tool_name_delta=updated_tool_name_delta) if isinstance(self.args_delta, str): if isinstance(delta.args_delta, dict): raise UnexpectedModelBehavior( f'Cannot apply JSON deltas to non-JSON tool arguments ({delta=}, {self=})' ) updated_args_delta = (delta.args_delta or '') + self.args_delta delta = replace(delta, args_delta=updated_args_delta) elif isinstance(self.args_delta, dict): if isinstance(delta.args_delta, str): raise UnexpectedModelBehavior( f'Cannot apply dict deltas to non-dict tool arguments ({delta=}, {self=})' ) updated_args_delta = {**(delta.args_delta or {}), **self.args_delta} delta = replace(delta, args_delta=updated_args_delta) if self.tool_call_id: delta = replace(delta, tool_call_id=self.tool_call_id) # If we now have enough data to create a full ToolCallPart, do so if delta.tool_name_delta is not None: return ToolCallPart(delta.tool_name_delta, delta.args_delta, delta.tool_call_id or _generate_tool_call_id()) return delta def _apply_to_part(self, part: ToolCallPart) -> ToolCallPart: """Internal helper to apply this delta directly to a `ToolCallPart`.""" if self.tool_name_delta: # Append incremental text to the existing tool_name tool_name = part.tool_name + self.tool_name_delta part = replace(part, tool_name=tool_name) if isinstance(self.args_delta, str): if isinstance(part.args, dict): raise UnexpectedModelBehavior(f'Cannot apply JSON deltas to non-JSON tool arguments ({part=}, {self=})') updated_json = (part.args or '') + self.args_delta part = replace(part, args=updated_json) elif isinstance(self.args_delta, dict): if isinstance(part.args, str): raise UnexpectedModelBehavior(f'Cannot apply dict deltas to non-dict tool arguments ({part=}, {self=})') updated_dict = {**(part.args or {}), **self.args_delta} part = replace(part, args=updated_dict) if self.tool_call_id: part = replace(part, tool_call_id=self.tool_call_id) return part __repr__ = _utils.dataclasses_no_defaults_repr ``` #### tool_name_delta ```python tool_name_delta: str | None = None ``` Incremental text to add to the existing tool name, if any. 
#### args_delta ```python args_delta: str | dict[str, Any] | None = None ``` Incremental data to add to the tool arguments. If this is a string, it will be appended to existing JSON arguments. If this is a dict, it will be merged with existing dict arguments. #### tool_call_id ```python tool_call_id: str | None = None ``` Optional tool call identifier, this is used by some models including OpenAI. Note this is never treated as a delta — it can replace None, but otherwise if a non-matching value is provided an error will be raised. #### part_delta_kind ```python part_delta_kind: Literal['tool_call'] = 'tool_call' ``` Part delta type identifier, used as a discriminator. #### as_part ```python as_part() -> ToolCallPart | None ``` Convert this delta to a fully formed `ToolCallPart` if possible, otherwise return `None`. Returns: | Type | Description | | --- | --- | | `ToolCallPart | None` | A ToolCallPart if tool_name_delta is set, otherwise None. | Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def as_part(self) -> ToolCallPart | None: """Convert this delta to a fully formed `ToolCallPart` if possible, otherwise return `None`. Returns: A `ToolCallPart` if `tool_name_delta` is set, otherwise `None`. """ if self.tool_name_delta is None: return None return ToolCallPart(self.tool_name_delta, self.args_delta, self.tool_call_id or _generate_tool_call_id()) ``` #### apply ```python apply(part: ModelResponsePart) -> ToolCallPart ``` ```python apply( part: ModelResponsePart | ToolCallPartDelta, ) -> ToolCallPart | ToolCallPartDelta ``` ```python apply( part: ModelResponsePart | ToolCallPartDelta, ) -> ToolCallPart | ToolCallPartDelta ``` Apply this delta to a part or delta, returning a new part or delta with the changes applied. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `part` | `ModelResponsePart | ToolCallPartDelta` | The existing model response part or delta to update. | *required* | Returns: | Type | Description | | --- | --- | | `ToolCallPart | ToolCallPartDelta` | Either a new ToolCallPart or an updated ToolCallPartDelta. | Raises: | Type | Description | | --- | --- | | `ValueError` | If part is neither a ToolCallPart nor a ToolCallPartDelta. | | `UnexpectedModelBehavior` | If applying JSON deltas to dict arguments or vice versa. | Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def apply(self, part: ModelResponsePart | ToolCallPartDelta) -> ToolCallPart | ToolCallPartDelta: """Apply this delta to a part or delta, returning a new part or delta with the changes applied. Args: part: The existing model response part or delta to update. Returns: Either a new `ToolCallPart` or an updated `ToolCallPartDelta`. Raises: ValueError: If `part` is neither a `ToolCallPart` nor a `ToolCallPartDelta`. UnexpectedModelBehavior: If applying JSON deltas to dict arguments or vice versa. """ if isinstance(part, ToolCallPart): return self._apply_to_part(part) if isinstance(part, ToolCallPartDelta): return self._apply_to_delta(part) raise ValueError( # pragma: no cover f'Can only apply ToolCallPartDeltas to ToolCallParts or ToolCallPartDeltas, not {part}' ) ``` ### ModelResponsePartDelta ```python ModelResponsePartDelta = Annotated[ TextPartDelta | ThinkingPartDelta | ToolCallPartDelta, Discriminator("part_delta_kind"), ] ``` A partial update (delta) for any model response part. ### PartStartEvent An event indicating that a new part has started. 
If multiple `PartStartEvent`s are received with the same index, the new one should fully replace the old one. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class PartStartEvent: """An event indicating that a new part has started. If multiple `PartStartEvent`s are received with the same index, the new one should fully replace the old one. """ index: int """The index of the part within the overall response parts list.""" part: ModelResponsePart """The newly started `ModelResponsePart`.""" event_kind: Literal['part_start'] = 'part_start' """Event type identifier, used as a discriminator.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### index ```python index: int ``` The index of the part within the overall response parts list. #### part ```python part: ModelResponsePart ``` The newly started `ModelResponsePart`. #### event_kind ```python event_kind: Literal['part_start'] = 'part_start' ``` Event type identifier, used as a discriminator. ### PartDeltaEvent An event indicating a delta update for an existing part. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class PartDeltaEvent: """An event indicating a delta update for an existing part.""" index: int """The index of the part within the overall response parts list.""" delta: ModelResponsePartDelta """The delta to apply to the specified part.""" event_kind: Literal['part_delta'] = 'part_delta' """Event type identifier, used as a discriminator.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### index ```python index: int ``` The index of the part within the overall response parts list. #### delta ```python delta: ModelResponsePartDelta ``` The delta to apply to the specified part. #### event_kind ```python event_kind: Literal['part_delta'] = 'part_delta' ``` Event type identifier, used as a discriminator. ### FinalResultEvent An event indicating the response to the current model request matches the output schema and will produce a result. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class FinalResultEvent: """An event indicating the response to the current model request matches the output schema and will produce a result.""" tool_name: str | None """The name of the output tool that was called. `None` if the result is from text content and not from a tool.""" tool_call_id: str | None """The tool call ID, if any, that this result is associated with.""" event_kind: Literal['final_result'] = 'final_result' """Event type identifier, used as a discriminator.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### tool_name ```python tool_name: str | None ``` The name of the output tool that was called. `None` if the result is from text content and not from a tool. #### tool_call_id ```python tool_call_id: str | None ``` The tool call ID, if any, that this result is associated with. #### event_kind ```python event_kind: Literal['final_result'] = 'final_result' ``` Event type identifier, used as a discriminator. ### ModelResponseStreamEvent ```python ModelResponseStreamEvent = Annotated[ PartStartEvent | PartDeltaEvent | FinalResultEvent, Discriminator("event_kind"), ] ``` An event in the model response stream, starting a new part, applying a delta to an existing one, or indicating the final result. ### FunctionToolCallEvent An event indicating the start to a call to a function tool. 
Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class FunctionToolCallEvent: """An event indicating the start to a call to a function tool.""" part: ToolCallPart """The (function) tool call to make.""" _: KW_ONLY event_kind: Literal['function_tool_call'] = 'function_tool_call' """Event type identifier, used as a discriminator.""" @property def tool_call_id(self) -> str: """An ID used for matching details about the call to its result.""" return self.part.tool_call_id @property @deprecated('`call_id` is deprecated, use `tool_call_id` instead.') def call_id(self) -> str: """An ID used for matching details about the call to its result.""" return self.part.tool_call_id # pragma: no cover __repr__ = _utils.dataclasses_no_defaults_repr ``` #### part ```python part: ToolCallPart ``` The (function) tool call to make. #### event_kind ```python event_kind: Literal["function_tool_call"] = ( "function_tool_call" ) ``` Event type identifier, used as a discriminator. #### tool_call_id ```python tool_call_id: str ``` An ID used for matching details about the call to its result. #### call_id ```python call_id: str ``` An ID used for matching details about the call to its result. ### FunctionToolResultEvent An event indicating the result of a function tool call. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class FunctionToolResultEvent: """An event indicating the result of a function tool call.""" result: ToolReturnPart | RetryPromptPart """The result of the call to the function tool.""" _: KW_ONLY event_kind: Literal['function_tool_result'] = 'function_tool_result' """Event type identifier, used as a discriminator.""" @property def tool_call_id(self) -> str: """An ID used to match the result to its original call.""" return self.result.tool_call_id __repr__ = _utils.dataclasses_no_defaults_repr ``` #### result ```python result: ToolReturnPart | RetryPromptPart ``` The result of the call to the function tool. #### event_kind ```python event_kind: Literal["function_tool_result"] = ( "function_tool_result" ) ``` Event type identifier, used as a discriminator. #### tool_call_id ```python tool_call_id: str ``` An ID used to match the result to its original call. ### BuiltinToolCallEvent An event indicating the start to a call to a built-in tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BuiltinToolCallEvent: """An event indicating the start to a call to a built-in tool.""" part: BuiltinToolCallPart """The built-in tool call to make.""" _: KW_ONLY event_kind: Literal['builtin_tool_call'] = 'builtin_tool_call' """Event type identifier, used as a discriminator.""" ``` #### part ```python part: BuiltinToolCallPart ``` The built-in tool call to make. #### event_kind ```python event_kind: Literal["builtin_tool_call"] = ( "builtin_tool_call" ) ``` Event type identifier, used as a discriminator. ### BuiltinToolResultEvent An event indicating the result of a built-in tool call. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BuiltinToolResultEvent: """An event indicating the result of a built-in tool call.""" result: BuiltinToolReturnPart """The result of the call to the built-in tool.""" _: KW_ONLY event_kind: Literal['builtin_tool_result'] = 'builtin_tool_result' """Event type identifier, used as a discriminator.""" ``` #### result ```python result: BuiltinToolReturnPart ``` The result of the call to the built-in tool. 
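Taken together, these tool events can be observed while an agent runs by passing an `event_stream_handler`, as described in the running-agents docs. Here's a minimal sketch (the agent, tool, and log messages are illustrative, not part of the library):

```python
from collections.abc import AsyncIterable

from pydantic_ai import Agent, RunContext
from pydantic_ai.messages import (
    AgentStreamEvent,
    FunctionToolCallEvent,
    FunctionToolResultEvent,
)

agent = Agent('openai:gpt-4o')

@agent.tool_plain
def roll_die() -> int:
    """Roll a six-sided die."""
    return 4

async def log_tool_events(ctx: RunContext, events: AsyncIterable[AgentStreamEvent]) -> None:
    # The events are discriminated by `event_kind`, so they can be matched by type.
    async for event in events:
        if isinstance(event, FunctionToolCallEvent):
            print(f'Tool call started: {event.part.tool_name!r} ({event.tool_call_id})')
        elif isinstance(event, FunctionToolResultEvent):
            print(f'Tool call finished: {event.tool_call_id}')

async def main():
    result = await agent.run('Roll the die for me.', event_stream_handler=log_tool_events)
    print(result.output)
```

The same `isinstance` dispatch works for `BuiltinToolCallEvent` and `BuiltinToolResultEvent` when the model executes built-in tools.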
#### event_kind ```python event_kind: Literal["builtin_tool_result"] = ( "builtin_tool_result" ) ``` Event type identifier, used as a discriminator. ### HandleResponseEvent ```python HandleResponseEvent = Annotated[ FunctionToolCallEvent | FunctionToolResultEvent | BuiltinToolCallEvent | BuiltinToolResultEvent, Discriminator("event_kind"), ] ``` An event yielded when handling a model response, indicating tool calls and results. ### AgentStreamEvent ```python AgentStreamEvent = Annotated[ ModelResponseStreamEvent | HandleResponseEvent, Discriminator("event_kind"), ] ``` An event in the agent stream: model response stream events and response-handling events. # `pydantic_ai.output` ### OutputDataT ```python OutputDataT = TypeVar( "OutputDataT", default=str, covariant=True ) ``` Covariant type variable for the output data type of a run. ### ToolOutput Bases: `Generic[OutputDataT]` Marker class to use a tool for output and optionally customize the tool. Example: tool_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, ToolOutput class Fruit(BaseModel): name: str color: str class Vehicle(BaseModel): name: str wheels: int agent = Agent( 'openai:gpt-4o', output_type=[ ToolOutput(Fruit, name='return_fruit'), ToolOutput(Vehicle, name='return_vehicle'), ], ) result = agent.run_sync('What is a banana?') print(repr(result.output)) #> Fruit(name='banana', color='yellow') ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python @dataclass(init=False) class ToolOutput(Generic[OutputDataT]): """Marker class to use a tool for output and optionally customize the tool. Example: ```python {title="tool_output.py"} from pydantic import BaseModel from pydantic_ai import Agent, ToolOutput class Fruit(BaseModel): name: str color: str class Vehicle(BaseModel): name: str wheels: int agent = Agent( 'openai:gpt-4o', output_type=[ ToolOutput(Fruit, name='return_fruit'), ToolOutput(Vehicle, name='return_vehicle'), ], ) result = agent.run_sync('What is a banana?') print(repr(result.output)) #> Fruit(name='banana', color='yellow') ``` """ output: OutputTypeOrFunction[OutputDataT] """An output type or function.""" name: str | None """The name of the tool that will be passed to the model. If not specified and only one output is provided, `final_result` will be used. If multiple outputs are provided, the name of the output type or function will be added to the tool name.""" description: str | None """The description of the tool that will be passed to the model. If not specified, the docstring of the output type or function will be used.""" max_retries: int | None """The maximum number of retries for the tool.""" strict: bool | None """Whether to use strict mode for the tool.""" def __init__( self, type_: OutputTypeOrFunction[OutputDataT], *, name: str | None = None, description: str | None = None, max_retries: int | None = None, strict: bool | None = None, ): self.output = type_ self.name = name self.description = description self.max_retries = max_retries self.strict = strict ```` #### output ```python output: OutputTypeOrFunction[OutputDataT] = type_ ``` An output type or function. #### name ```python name: str | None = name ``` The name of the tool that will be passed to the model. If not specified and only one output is provided, `final_result` will be used. If multiple outputs are provided, the name of the output type or function will be added to the tool name. 
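For illustration, here's a minimal sketch (the model string, names, and settings are made up) that pins the tool `name` explicitly and also sets the `description`, `max_retries`, and `strict` fields documented below:

```python
from pydantic import BaseModel

from pydantic_ai import Agent, ToolOutput

class Fruit(BaseModel):
    name: str
    color: str

# The output tool presented to the model is fully described here: its name,
# its description, how many validation retries are allowed, and whether
# strict mode should be requested.
agent = Agent(
    'openai:gpt-4o',
    output_type=ToolOutput(
        Fruit,
        name='return_fruit',
        description='Return the fruit the user described.',
        max_retries=2,
        strict=True,
    ),
)
```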
#### description ```python description: str | None = description ``` The description of the tool that will be passed to the model. If not specified, the docstring of the output type or function will be used. #### max_retries ```python max_retries: int | None = max_retries ``` The maximum number of retries for the tool. #### strict ```python strict: bool | None = strict ``` Whether to use strict mode for the tool. ### NativeOutput Bases: `Generic[OutputDataT]` Marker class to use the model's native structured outputs functionality for outputs and optionally customize the name and description. Example: native_output.py ```python from pydantic_ai import Agent, NativeOutput from tool_output import Fruit, Vehicle agent = Agent( 'openai:gpt-4o', output_type=NativeOutput( [Fruit, Vehicle], name='Fruit or vehicle', description='Return a fruit or vehicle.' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python @dataclass(init=False) class NativeOutput(Generic[OutputDataT]): """Marker class to use the model's native structured outputs functionality for outputs and optionally customize the name and description. Example: ```python {title="native_output.py" requires="tool_output.py"} from pydantic_ai import Agent, NativeOutput from tool_output import Fruit, Vehicle agent = Agent( 'openai:gpt-4o', output_type=NativeOutput( [Fruit, Vehicle], name='Fruit or vehicle', description='Return a fruit or vehicle.' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` """ outputs: OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]] """The output types or functions.""" name: str | None """The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used.""" description: str | None """The description of the structured output that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used.""" strict: bool | None """Whether to use strict mode for the output, if the model supports it.""" def __init__( self, outputs: OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]], *, name: str | None = None, description: str | None = None, strict: bool | None = None, ): self.outputs = outputs self.name = name self.description = description self.strict = strict ```` #### outputs ```python outputs: ( OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]] ) = outputs ``` The output types or functions. #### name ```python name: str | None = name ``` The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used. #### description ```python description: str | None = description ``` The description of the structured output that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used. #### strict ```python strict: bool | None = strict ``` Whether to use strict mode for the output, if the model supports it. ### PromptedOutput Bases: `Generic[OutputDataT]` Marker class to use a prompt to tell the model what to output and optionally customize the prompt. 
Example: prompted_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, PromptedOutput from tool_output import Vehicle class Device(BaseModel): name: str kind: str agent = Agent( 'openai:gpt-4o', output_type=PromptedOutput( [Vehicle, Device], name='Vehicle or device', description='Return a vehicle or device.' ), ) result = agent.run_sync('What is a MacBook?') print(repr(result.output)) #> Device(name='MacBook', kind='laptop') agent = Agent( 'openai:gpt-4o', output_type=PromptedOutput( [Vehicle, Device], template='Gimme some JSON: {schema}' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python @dataclass(init=False) class PromptedOutput(Generic[OutputDataT]): """Marker class to use a prompt to tell the model what to output and optionally customize the prompt. Example: ```python {title="prompted_output.py" requires="tool_output.py"} from pydantic import BaseModel from pydantic_ai import Agent, PromptedOutput from tool_output import Vehicle class Device(BaseModel): name: str kind: str agent = Agent( 'openai:gpt-4o', output_type=PromptedOutput( [Vehicle, Device], name='Vehicle or device', description='Return a vehicle or device.' ), ) result = agent.run_sync('What is a MacBook?') print(repr(result.output)) #> Device(name='MacBook', kind='laptop') agent = Agent( 'openai:gpt-4o', output_type=PromptedOutput( [Vehicle, Device], template='Gimme some JSON: {schema}' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` """ outputs: OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]] """The output types or functions.""" name: str | None """The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used.""" description: str | None """The description that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used.""" template: str | None """Template for the prompt passed to the model. The '{schema}' placeholder will be replaced with the output JSON schema. If not specified, the default template specified on the model's profile will be used. """ def __init__( self, outputs: OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]], *, name: str | None = None, description: str | None = None, template: str | None = None, ): self.outputs = outputs self.name = name self.description = description self.template = template ```` #### outputs ```python outputs: ( OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]] ) = outputs ``` The output types or functions. #### name ```python name: str | None = name ``` The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used. #### description ```python description: str | None = description ``` The description that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used. #### template ```python template: str | None = template ``` Template for the prompt passed to the model. The '{schema}' placeholder will be replaced with the output JSON schema. 
If not specified, the default template specified on the model's profile will be used. ### TextOutput Bases: `Generic[OutputDataT]` Marker class to use text output for an output function taking a string argument. Example: ```python from pydantic_ai import Agent, TextOutput def split_into_words(text: str) -> list[str]: return text.split() agent = Agent( 'openai:gpt-4o', output_type=TextOutput(split_into_words), ) result = agent.run_sync('Who was Albert Einstein?') print(result.output) #> ['Albert', 'Einstein', 'was', 'a', 'German-born', 'theoretical', 'physicist.'] ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python @dataclass class TextOutput(Generic[OutputDataT]): """Marker class to use text output for an output function taking a string argument. Example: ```python from pydantic_ai import Agent, TextOutput def split_into_words(text: str) -> list[str]: return text.split() agent = Agent( 'openai:gpt-4o', output_type=TextOutput(split_into_words), ) result = agent.run_sync('Who was Albert Einstein?') print(result.output) #> ['Albert', 'Einstein', 'was', 'a', 'German-born', 'theoretical', 'physicist.'] ``` """ output_function: TextOutputFunc[OutputDataT] """The function that will be called to process the model's plain text output. The function must take a single string argument.""" ```` #### output_function ```python output_function: TextOutputFunc[OutputDataT] ``` The function that will be called to process the model's plain text output. The function must take a single string argument. ### StructuredDict ```python StructuredDict( json_schema: JsonSchemaValue, name: str | None = None, description: str | None = None, ) -> type[JsonSchemaValue] ``` Returns a `dict[str, Any]` subclass with a JSON schema attached that will be used for structured output. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `json_schema` | `JsonSchemaValue` | A JSON schema of type object defining the structure of the dictionary content. | *required* | | `name` | `str | None` | Optional name of the structured output. If not provided, the title field of the JSON schema will be used if it's present. | `None` | | `description` | `str | None` | Optional description of the structured output. If not provided, the description field of the JSON schema will be used if it's present. | `None` | Example: structured_dict.py ```python from pydantic_ai import Agent, StructuredDict schema = { 'type': 'object', 'properties': { 'name': {'type': 'string'}, 'age': {'type': 'integer'} }, 'required': ['name', 'age'] } agent = Agent('openai:gpt-4o', output_type=StructuredDict(schema)) result = agent.run_sync('Create a person') print(result.output) #> {'name': 'John Doe', 'age': 30} ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python def StructuredDict( json_schema: JsonSchemaValue, name: str | None = None, description: str | None = None ) -> type[JsonSchemaValue]: """Returns a `dict[str, Any]` subclass with a JSON schema attached that will be used for structured output. Args: json_schema: A JSON schema of type `object` defining the structure of the dictionary content. name: Optional name of the structured output. If not provided, the `title` field of the JSON schema will be used if it's present. description: Optional description of the structured output. If not provided, the `description` field of the JSON schema will be used if it's present. 
Example: ```python {title="structured_dict.py"} from pydantic_ai import Agent, StructuredDict schema = { 'type': 'object', 'properties': { 'name': {'type': 'string'}, 'age': {'type': 'integer'} }, 'required': ['name', 'age'] } agent = Agent('openai:gpt-4o', output_type=StructuredDict(schema)) result = agent.run_sync('Create a person') print(result.output) #> {'name': 'John Doe', 'age': 30} ``` """ json_schema = _utils.check_object_json_schema(json_schema) if name: json_schema['title'] = name if description: json_schema['description'] = description class _StructuredDict(JsonSchemaValue): __is_model_like__ = True @classmethod def __get_pydantic_core_schema__( cls, source_type: Any, handler: GetCoreSchemaHandler ) -> core_schema.CoreSchema: return core_schema.dict_schema( keys_schema=core_schema.str_schema(), values_schema=core_schema.any_schema(), ) @classmethod def __get_pydantic_json_schema__( cls, core_schema: core_schema.CoreSchema, handler: GetJsonSchemaHandler ) -> JsonSchemaValue: return json_schema return _StructuredDict ```` ### DeferredToolRequests Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools. Results can be passed to the next agent run using a DeferredToolResults object with the same tool call IDs. See [deferred tools docs](../../deferred-tools/#deferred-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(kw_only=True) class DeferredToolRequests: """Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools. Results can be passed to the next agent run using a [`DeferredToolResults`][pydantic_ai.tools.DeferredToolResults] object with the same tool call IDs. See [deferred tools docs](../deferred-tools.md#deferred-tools) for more information. """ calls: list[ToolCallPart] = field(default_factory=list) """Tool calls that require external execution.""" approvals: list[ToolCallPart] = field(default_factory=list) """Tool calls that require human-in-the-loop approval.""" ``` #### calls ```python calls: list[ToolCallPart] = field(default_factory=list) ``` Tool calls that require external execution. #### approvals ```python approvals: list[ToolCallPart] = field(default_factory=list) ``` Tool calls that require human-in-the-loop approval. # `pydantic_ai.profiles` Describes how requests to and responses from specific models or families of models need to be constructed and processed to get the best results, independent of the model and provider classes used. 
Source code in `pydantic_ai_slim/pydantic_ai/profiles/__init__.py` ```python @dataclass(kw_only=True) class ModelProfile: """Describes how requests to and responses from specific models or families of models need to be constructed and processed to get the best results, independent of the model and provider classes used.""" supports_tools: bool = True """Whether the model supports tools.""" supports_json_schema_output: bool = False """Whether the model supports JSON schema output.""" supports_json_object_output: bool = False """Whether the model supports JSON object output.""" default_structured_output_mode: StructuredOutputMode = 'tool' """The default structured output mode to use for the model.""" prompted_output_template: str = dedent( """ Always respond with a JSON object that's compatible with this schema: {schema} Don't include any text or Markdown fencing before or after. """ ) """The instructions template to use for prompted structured output. The '{schema}' placeholder will be replaced with the JSON schema for the output.""" json_schema_transformer: type[JsonSchemaTransformer] | None = None """The transformer to use to make JSON schemas for tools and structured output compatible with the model.""" thinking_tags: tuple[str, str] = ('', '') """The tags used to indicate thinking parts in the model's output. Defaults to ('', '').""" ignore_streamed_leading_whitespace: bool = False """Whether to ignore leading whitespace when streaming a response. This is a workaround for models that emit `\n\n\n` or an empty text part ahead of tool calls (e.g. Ollama + Qwen3), which we don't want to end up treating as a final result when using `run_stream` with `str` a valid `output_type`. This is currently only used by `OpenAIChatModel`, `HuggingFaceModel`, and `GroqModel`. """ @classmethod def from_profile(cls, profile: ModelProfile | None) -> Self: """Build a ModelProfile subclass instance from a ModelProfile instance.""" if isinstance(profile, cls): return profile return cls().update(profile) def update(self, profile: ModelProfile | None) -> Self: """Update this ModelProfile (subclass) instance with the non-default values from another ModelProfile instance.""" if not profile: return self field_names = set(f.name for f in fields(self)) non_default_attrs = { f.name: getattr(profile, f.name) for f in fields(profile) if f.name in field_names and getattr(profile, f.name) != f.default } return replace(self, **non_default_attrs) ``` ### supports_tools ```python supports_tools: bool = True ``` Whether the model supports tools. ### supports_json_schema_output ```python supports_json_schema_output: bool = False ``` Whether the model supports JSON schema output. ### supports_json_object_output ```python supports_json_object_output: bool = False ``` Whether the model supports JSON object output. ### default_structured_output_mode ```python default_structured_output_mode: StructuredOutputMode = ( "tool" ) ``` The default structured output mode to use for the model. ### prompted_output_template ```python prompted_output_template: str = dedent( "\n Always respond with a JSON object that's compatible with this schema:\n\n {schema}\n\n Don't include any text or Markdown fencing before or after.\n " ) ``` The instructions template to use for prompted structured output. The '{schema}' placeholder will be replaced with the JSON schema for the output. 
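Since `ModelProfile` is a plain dataclass, profiles can be combined with the `from_profile` and `update` helpers shown in the source above. Here's a minimal sketch (the field values are arbitrary) of layering a more specific profile over a base profile:

```python
from pydantic_ai.profiles import ModelProfile

# A base profile for a hypothetical family of models.
family_profile = ModelProfile(
    supports_json_object_output=True,
    default_structured_output_mode='prompted',
)

# `update` copies the non-default values of the given profile onto this one.
model_profile = family_profile.update(ModelProfile(supports_tools=False))
print(model_profile.default_structured_output_mode, model_profile.supports_tools)
#> prompted False
```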
#### json_schema_transformer

```python
json_schema_transformer: (
    type[JsonSchemaTransformer] | None
) = None
```

The transformer to use to make JSON schemas for tools and structured output compatible with the model.

#### thinking_tags

```python
thinking_tags: tuple[str, str] = ('', '')
```

The tags used to indicate thinking parts in the model's output. Defaults to ('', '').

#### ignore_streamed_leading_whitespace

```python
ignore_streamed_leading_whitespace: bool = False
```

Whether to ignore leading whitespace when streaming a response.

This is a workaround for models that emit `\n\n\n` or an empty text part ahead of tool calls (e.g. Ollama + Qwen3), which we don't want to end up treating as a final result when using `run_stream` with `str` a valid `output_type`.

This is currently only used by `OpenAIChatModel`, `HuggingFaceModel`, and `GroqModel`.

### from_profile

```python
from_profile(profile: ModelProfile | None) -> Self
```

Build a ModelProfile subclass instance from a ModelProfile instance.

Source code in `pydantic_ai_slim/pydantic_ai/profiles/__init__.py`

```python
@classmethod
def from_profile(cls, profile: ModelProfile | None) -> Self:
    """Build a ModelProfile subclass instance from a ModelProfile instance."""
    if isinstance(profile, cls):
        return profile
    return cls().update(profile)
```

### update

```python
update(profile: ModelProfile | None) -> Self
```

Update this ModelProfile (subclass) instance with the non-default values from another ModelProfile instance.

Source code in `pydantic_ai_slim/pydantic_ai/profiles/__init__.py`

```python
def update(self, profile: ModelProfile | None) -> Self:
    """Update this ModelProfile (subclass) instance with the non-default values from another ModelProfile instance."""
    if not profile:
        return self
    field_names = set(f.name for f in fields(self))
    non_default_attrs = {
        f.name: getattr(profile, f.name)
        for f in fields(profile)
        if f.name in field_names and getattr(profile, f.name) != f.default
    }
    return replace(self, **non_default_attrs)
```

### OpenAIModelProfile

Bases: `ModelProfile`

Profile for models used with `OpenAIChatModel`.

ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.

Source code in `pydantic_ai_slim/pydantic_ai/profiles/openai.py`

```python
@dataclass(kw_only=True)
class OpenAIModelProfile(ModelProfile):
    """Profile for models used with `OpenAIChatModel`.

    ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.
    """

    openai_supports_strict_tool_definition: bool = True
    """This can be set by a provider or user if the OpenAI-"compatible" API doesn't support strict tool definitions."""

    openai_supports_sampling_settings: bool = True
    """Turn off to don't send sampling settings like `temperature` and `top_p` to models that don't support them, like OpenAI's o-series reasoning models."""

    openai_unsupported_model_settings: Sequence[str] = ()
    """A list of model settings that are not supported by the model."""

    # Some OpenAI-compatible providers (e.g. MoonshotAI) currently do **not** accept
    # `tool_choice="required"`. This flag lets the calling model know whether it's
    # safe to pass that value along. Default is `True` to preserve existing
    # behaviour for OpenAI itself and most providers.
    openai_supports_tool_choice_required: bool = True
    """Whether the provider accepts the value ``tool_choice='required'`` in the request payload."""

    openai_system_prompt_role: OpenAISystemPromptRole | None = None
    """The role to use for the system prompt message.
If not provided, defaults to `'system'`.""" openai_chat_supports_web_search: bool = False """Whether the model supports web search in Chat Completions API.""" openai_supports_encrypted_reasoning_content: bool = False """Whether the model supports including encrypted reasoning content in the response.""" def __post_init__(self): # pragma: no cover if not self.openai_supports_sampling_settings: warnings.warn( 'The `openai_supports_sampling_settings` has no effect, and it will be removed in future versions. ' 'Use `openai_unsupported_model_settings` instead.', DeprecationWarning, ) ``` #### openai_supports_strict_tool_definition ```python openai_supports_strict_tool_definition: bool = True ``` This can be set by a provider or user if the OpenAI-"compatible" API doesn't support strict tool definitions. #### openai_supports_sampling_settings ```python openai_supports_sampling_settings: bool = True ``` Turn off to don't send sampling settings like `temperature` and `top_p` to models that don't support them, like OpenAI's o-series reasoning models. #### openai_unsupported_model_settings ```python openai_unsupported_model_settings: Sequence[str] = () ``` A list of model settings that are not supported by the model. #### openai_supports_tool_choice_required ```python openai_supports_tool_choice_required: bool = True ``` Whether the provider accepts the value `tool_choice='required'` in the request payload. #### openai_system_prompt_role ```python openai_system_prompt_role: OpenAISystemPromptRole | None = ( None ) ``` The role to use for the system prompt message. If not provided, defaults to `'system'`. #### openai_chat_supports_web_search ```python openai_chat_supports_web_search: bool = False ``` Whether the model supports web search in Chat Completions API. #### openai_supports_encrypted_reasoning_content ```python openai_supports_encrypted_reasoning_content: bool = False ``` Whether the model supports including encrypted reasoning content in the response. ### openai_model_profile ```python openai_model_profile(model_name: str) -> ModelProfile ``` Get the model profile for an OpenAI model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/openai.py` ```python def openai_model_profile(model_name: str) -> ModelProfile: """Get the model profile for an OpenAI model.""" is_reasoning_model = model_name.startswith('o') or model_name.startswith('gpt-5') # Check if the model supports web search (only specific search-preview models) supports_web_search = '-search-preview' in model_name # Structured Outputs (output mode 'native') is only supported with the gpt-4o-mini, gpt-4o-mini-2024-07-18, and gpt-4o-2024-08-06 model snapshots and later. # We leave it in here for all models because the `default_structured_output_mode` is `'tool'`, so `native` is only used # when the user specifically uses the `NativeOutput` marker, so an error from the API is acceptable. if is_reasoning_model: openai_unsupported_model_settings = ( 'temperature', 'top_p', 'presence_penalty', 'frequency_penalty', 'logit_bias', 'logprobs', 'top_logprobs', ) else: openai_unsupported_model_settings = () # The o1-mini model doesn't support the `system` role, so we default to `user`. # See https://github.com/pydantic/pydantic-ai/issues/974 for more details. 
openai_system_prompt_role = 'user' if model_name.startswith('o1-mini') else None return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, supports_json_schema_output=True, supports_json_object_output=True, openai_unsupported_model_settings=openai_unsupported_model_settings, openai_system_prompt_role=openai_system_prompt_role, openai_chat_supports_web_search=supports_web_search, openai_supports_encrypted_reasoning_content=is_reasoning_model, ) ``` ### OpenAIJsonSchemaTransformer Bases: `JsonSchemaTransformer` Recursively handle the schema to make it compatible with OpenAI strict mode. See https://platform.openai.com/docs/guides/function-calling?api-mode=responses#strict-mode for more details, but this basically just requires: * `additionalProperties` must be set to false for each object in the parameters * all fields in properties must be marked as required Source code in `pydantic_ai_slim/pydantic_ai/profiles/openai.py` ```python @dataclass(init=False) class OpenAIJsonSchemaTransformer(JsonSchemaTransformer): """Recursively handle the schema to make it compatible with OpenAI strict mode. See https://platform.openai.com/docs/guides/function-calling?api-mode=responses#strict-mode for more details, but this basically just requires: * `additionalProperties` must be set to false for each object in the parameters * all fields in properties must be marked as required """ def __init__(self, schema: JsonSchema, *, strict: bool | None = None): super().__init__(schema, strict=strict) self.root_ref = schema.get('$ref') def walk(self) -> JsonSchema: # Note: OpenAI does not support anyOf at the root in strict mode # However, we don't need to check for it here because we ensure in pydantic_ai._utils.check_object_json_schema # that the root schema either has type 'object' or is recursive. result = super().walk() # For recursive models, we need to tweak the schema to make it compatible with strict mode. # Because the following should never change the semantics of the schema we apply it unconditionally. if self.root_ref is not None: result.pop('$ref', None) # We replace references to the self.root_ref with just '#' in the transform method root_key = re.sub(r'^#/\$defs/', '', self.root_ref) result.update(self.defs.get(root_key) or {}) return result def transform(self, schema: JsonSchema) -> JsonSchema: # noqa C901 # Remove unnecessary keys schema.pop('title', None) schema.pop('$schema', None) schema.pop('discriminator', None) default = schema.get('default', _sentinel) if default is not _sentinel: # the "default" keyword is not allowed in strict mode, but including it makes some Ollama models behave # better, so we keep it around when not strict if self.strict is True: schema.pop('default', None) elif self.strict is None: # pragma: no branch self.is_strict_compatible = False if schema_ref := schema.get('$ref'): if schema_ref == self.root_ref: schema['$ref'] = '#' if len(schema) > 1: # OpenAI Strict mode doesn't support siblings to "$ref", but _does_ allow siblings to "anyOf". 
# So if there is a "description" field or any other extra info, we move the "$ref" into an "anyOf": schema['anyOf'] = [{'$ref': schema.pop('$ref')}] # Track strict-incompatible keys incompatible_values: dict[str, Any] = {} for key in _STRICT_INCOMPATIBLE_KEYS: value = schema.get(key, _sentinel) if value is not _sentinel: incompatible_values[key] = value if format := schema.get('format'): if format not in _STRICT_COMPATIBLE_STRING_FORMATS: incompatible_values['format'] = format description = schema.get('description') if incompatible_values: if self.strict is True: notes: list[str] = [] for key, value in incompatible_values.items(): schema.pop(key) notes.append(f'{key}={value}') notes_string = ', '.join(notes) schema['description'] = notes_string if not description else f'{description} ({notes_string})' elif self.strict is None: # pragma: no branch self.is_strict_compatible = False schema_type = schema.get('type') if 'oneOf' in schema: # OpenAI does not support oneOf in strict mode if self.strict is True: schema['anyOf'] = schema.pop('oneOf') else: self.is_strict_compatible = False if schema_type == 'object': if self.strict is True: # additional properties are disallowed schema['additionalProperties'] = False # all properties are required if 'properties' not in schema: schema['properties'] = dict[str, Any]() schema['required'] = list(schema['properties'].keys()) elif self.strict is None: if schema.get('additionalProperties', None) not in (None, False): self.is_strict_compatible = False else: # additional properties are disallowed by default schema['additionalProperties'] = False if 'properties' not in schema or 'required' not in schema: self.is_strict_compatible = False else: required = schema['required'] for k in schema['properties'].keys(): if k not in required: self.is_strict_compatible = False return schema ``` ### anthropic_model_profile ```python anthropic_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for an Anthropic model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/anthropic.py` ```python def anthropic_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for an Anthropic model.""" return ModelProfile(thinking_tags=('', '')) ``` ### google_model_profile ```python google_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a Google model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/google.py` ```python def google_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Google model.""" return ModelProfile( json_schema_transformer=GoogleJsonSchemaTransformer, supports_json_schema_output=True, supports_json_object_output=True, ) ``` ### GoogleJsonSchemaTransformer Bases: `JsonSchemaTransformer` Transforms the JSON Schema from Pydantic to be suitable for Gemini. Gemini which [supports](https://ai.google.dev/gemini-api/docs/function-calling#function_declarations) a subset of OpenAPI v3.0.3. Specifically: * gemini doesn't allow the `title` keyword to be set * gemini doesn't allow `$defs` — we need to inline the definitions where possible Source code in `pydantic_ai_slim/pydantic_ai/profiles/google.py` ```python class GoogleJsonSchemaTransformer(JsonSchemaTransformer): """Transforms the JSON Schema from Pydantic to be suitable for Gemini. Gemini which [supports](https://ai.google.dev/gemini-api/docs/function-calling#function_declarations) a subset of OpenAPI v3.0.3. 
Specifically: * gemini doesn't allow the `title` keyword to be set * gemini doesn't allow `$defs` — we need to inline the definitions where possible """ def __init__(self, schema: JsonSchema, *, strict: bool | None = None): super().__init__(schema, strict=strict, prefer_inlined_defs=True, simplify_nullable_unions=True) def transform(self, schema: JsonSchema) -> JsonSchema: # Note: we need to remove `additionalProperties: False` since it is currently mishandled by Gemini additional_properties = schema.pop( 'additionalProperties', None ) # don't pop yet so it's included in the warning if additional_properties: original_schema = {**schema, 'additionalProperties': additional_properties} warnings.warn( '`additionalProperties` is not supported by Gemini; it will be removed from the tool JSON schema.' f' Full schema: {self.schema}\n\n' f'Source of additionalProperties within the full schema: {original_schema}\n\n' 'If this came from a field with a type like `dict[str, MyType]`, that field will always be empty.\n\n' "If Google's APIs are updated to support this properly, please create an issue on the Pydantic AI GitHub" ' and we will fix this behavior.', UserWarning, ) schema.pop('title', None) schema.pop('$schema', None) if (const := schema.pop('const', None)) is not None: # Gemini doesn't support const, but it does support enum with a single value schema['enum'] = [const] schema.pop('discriminator', None) schema.pop('examples', None) # TODO: Should we use the trick from pydantic_ai.models.openai._OpenAIJsonSchema # where we add notes about these properties to the field description? schema.pop('exclusiveMaximum', None) schema.pop('exclusiveMinimum', None) # Gemini only supports string enums, so we need to convert any enum values to strings. # Pydantic will take care of transforming the transformed string values to the correct type. if enum := schema.get('enum'): schema['type'] = 'string' schema['enum'] = [str(val) for val in enum] type_ = schema.get('type') if 'oneOf' in schema and 'type' not in schema: # pragma: no cover # This gets hit when we have a discriminated union # Gemini returns an API error in this case even though it says in its error message it shouldn't... # Changing the oneOf to an anyOf prevents the API error and I think is functionally equivalent schema['anyOf'] = schema.pop('oneOf') if type_ == 'string' and (fmt := schema.pop('format', None)): description = schema.get('description') if description: schema['description'] = f'{description} (format: {fmt})' else: schema['description'] = f'Format: {fmt}' if '$ref' in schema: raise UserError(f'Recursive `$ref`s in JSON Schema are not supported by Gemini: {schema["$ref"]}') if 'prefixItems' in schema: # prefixItems is not currently supported in Gemini, so we convert it to items for best compatibility prefix_items = schema.pop('prefixItems') items = schema.get('items') unique_items = [items] if items is not None else [] for item in prefix_items: if item not in unique_items: unique_items.append(item) if len(unique_items) > 1: # pragma: no cover schema['items'] = {'anyOf': unique_items} elif len(unique_items) == 1: # pragma: no branch schema['items'] = unique_items[0] schema.setdefault('minItems', len(prefix_items)) if items is None: # pragma: no branch schema.setdefault('maxItems', len(prefix_items)) return schema ``` ### meta_model_profile ```python meta_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Meta model. 
Source code in `pydantic_ai_slim/pydantic_ai/profiles/meta.py` ```python def meta_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Meta model.""" return ModelProfile(json_schema_transformer=InlineDefsJsonSchemaTransformer) ``` ### amazon_model_profile ```python amazon_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for an Amazon model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/amazon.py` ```python def amazon_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for an Amazon model.""" return ModelProfile(json_schema_transformer=InlineDefsJsonSchemaTransformer) ``` ### deepseek_model_profile ```python deepseek_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a DeepSeek model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/deepseek.py` ```python def deepseek_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a DeepSeek model.""" return ModelProfile(ignore_streamed_leading_whitespace='r1' in model_name) ``` ### grok_model_profile ```python grok_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Grok model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/grok.py` ```python def grok_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Grok model.""" return None ``` ### mistral_model_profile ```python mistral_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a Mistral model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/mistral.py` ```python def mistral_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Mistral model.""" return None ``` ### qwen_model_profile ```python qwen_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Qwen model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/qwen.py` ```python def qwen_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Qwen model.""" if model_name.startswith('qwen-3-coder'): return OpenAIModelProfile( json_schema_transformer=InlineDefsJsonSchemaTransformer, openai_supports_tool_choice_required=False, openai_supports_strict_tool_definition=False, ignore_streamed_leading_whitespace=True, ) return ModelProfile( json_schema_transformer=InlineDefsJsonSchemaTransformer, ignore_streamed_leading_whitespace=True, ) ``` # `pydantic_ai.providers` Bases: `ABC`, `Generic[InterfaceClient]` Abstract class for a provider. The provider is in charge of providing an authenticated client to the API. Each provider only supports a specific interface. A interface can be supported by multiple providers. For example, the `OpenAIChatModel` interface can be supported by the `OpenAIProvider` and the `DeepSeekProvider`. Source code in `pydantic_ai_slim/pydantic_ai/providers/__init__.py` ```python class Provider(ABC, Generic[InterfaceClient]): """Abstract class for a provider. The provider is in charge of providing an authenticated client to the API. Each provider only supports a specific interface. A interface can be supported by multiple providers. For example, the `OpenAIChatModel` interface can be supported by the `OpenAIProvider` and the `DeepSeekProvider`. 
""" _client: InterfaceClient @property @abstractmethod def name(self) -> str: """The provider name.""" raise NotImplementedError() @property @abstractmethod def base_url(self) -> str: """The base URL for the provider API.""" raise NotImplementedError() @property @abstractmethod def client(self) -> InterfaceClient: """The client for the provider.""" raise NotImplementedError() def model_profile(self, model_name: str) -> ModelProfile | None: """The model profile for the named model, if available.""" return None # pragma: no cover def __repr__(self) -> str: return f'{self.__class__.__name__}(name={self.name}, base_url={self.base_url})' ``` ### name ```python name: str ``` The provider name. ### base_url ```python base_url: str ``` The base URL for the provider API. ### client ```python client: InterfaceClient ``` The client for the provider. ### model_profile ```python model_profile(model_name: str) -> ModelProfile | None ``` The model profile for the named model, if available. Source code in `pydantic_ai_slim/pydantic_ai/providers/__init__.py` ```python def model_profile(self, model_name: str) -> ModelProfile | None: """The model profile for the named model, if available.""" return None # pragma: no cover ``` ### GoogleProvider Bases: `Provider[Client]` Provider for Google. Source code in `pydantic_ai_slim/pydantic_ai/providers/google.py` ```python class GoogleProvider(Provider[Client]): """Provider for Google.""" @property def name(self) -> str: return 'google-vertex' if self._client._api_client.vertexai else 'google-gla' # type: ignore[reportPrivateUsage] @property def base_url(self) -> str: return str(self._client._api_client._http_options.base_url) # type: ignore[reportPrivateUsage] @property def client(self) -> Client: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: return google_model_profile(model_name) @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__( self, *, credentials: Credentials | None = None, project: str | None = None, location: VertexAILocation | Literal['global'] | None = None, ) -> None: ... @overload def __init__(self, *, client: Client) -> None: ... @overload def __init__(self, *, vertexai: bool = False) -> None: ... def __init__( self, *, api_key: str | None = None, credentials: Credentials | None = None, project: str | None = None, location: VertexAILocation | Literal['global'] | None = None, client: Client | None = None, vertexai: bool | None = None, ) -> None: """Create a new Google provider. Args: api_key: The `API key `_ to use for authentication. It can also be set via the `GOOGLE_API_KEY` environment variable. Applies to the Gemini Developer API only. credentials: The credentials to use for authentication when calling the Vertex AI APIs. Credentials can be obtained from environment variables and default credentials. For more information, see Set up Application Default Credentials. Applies to the Vertex AI API only. project: The Google Cloud project ID to use for quota. Can be obtained from environment variables (for example, GOOGLE_CLOUD_PROJECT). Applies to the Vertex AI API only. location: The location to send API requests to (for example, us-central1). Can be obtained from environment variables. Applies to the Vertex AI API only. client: A pre-initialized client to use. vertexai: Force the use of the Vertex AI API. If `False`, the Google Generative Language API will be used. Defaults to `False`. """ if client is None: # NOTE: We are keeping GEMINI_API_KEY for backwards compatibility. 
api_key = api_key or os.getenv('GOOGLE_API_KEY') or os.getenv('GEMINI_API_KEY') if vertexai is None: vertexai = bool(location or project or credentials) http_options: HttpOptionsDict = { 'headers': {'User-Agent': get_user_agent()}, 'async_client_args': {'transport': httpx.AsyncHTTPTransport()}, } if not vertexai: if api_key is None: raise UserError( # pragma: no cover 'Set the `GOOGLE_API_KEY` environment variable or pass it via `GoogleProvider(api_key=...)`' 'to use the Google Generative Language API.' ) self._client = Client(vertexai=vertexai, api_key=api_key, http_options=http_options) else: self._client = Client( vertexai=vertexai, project=project or os.getenv('GOOGLE_CLOUD_PROJECT'), # From https://github.com/pydantic/pydantic-ai/pull/2031/files#r2169682149: # Currently `us-central1` supports the most models by far of any region including `global`, but not # all of them. `us-central1` has all google models but is missing some Anthropic partner models, # which use `us-east5` instead. `global` has fewer models but higher availability. # For more details, check: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#available-regions location=location or os.getenv('GOOGLE_CLOUD_LOCATION') or 'us-central1', credentials=credentials, http_options=http_options, ) else: self._client = client ``` #### __init__ ```python __init__(*, api_key: str) -> None ``` ```python __init__( *, credentials: Credentials | None = None, project: str | None = None, location: ( VertexAILocation | Literal["global"] | None ) = None ) -> None ``` ```python __init__(*, client: Client) -> None ``` ```python __init__(*, vertexai: bool = False) -> None ``` ```python __init__( *, api_key: str | None = None, credentials: Credentials | None = None, project: str | None = None, location: ( VertexAILocation | Literal["global"] | None ) = None, client: Client | None = None, vertexai: bool | None = None ) -> None ``` Create a new Google provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str | None` | The API key \_ to use for authentication. It can also be set via the GOOGLE_API_KEY environment variable. Applies to the Gemini Developer API only. | `None` | | `credentials` | `Credentials | None` | The credentials to use for authentication when calling the Vertex AI APIs. Credentials can be obtained from environment variables and default credentials. For more information, see Set up Application Default Credentials. Applies to the Vertex AI API only. | `None` | | `project` | `str | None` | The Google Cloud project ID to use for quota. Can be obtained from environment variables (for example, GOOGLE_CLOUD_PROJECT). Applies to the Vertex AI API only. | `None` | | `location` | `VertexAILocation | Literal['global'] | None` | The location to send API requests to (for example, us-central1). Can be obtained from environment variables. Applies to the Vertex AI API only. | `None` | | `client` | `Client | None` | A pre-initialized client to use. | `None` | | `vertexai` | `bool | None` | Force the use of the Vertex AI API. If False, the Google Generative Language API will be used. Defaults to False. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/google.py` ```python def __init__( self, *, api_key: str | None = None, credentials: Credentials | None = None, project: str | None = None, location: VertexAILocation | Literal['global'] | None = None, client: Client | None = None, vertexai: bool | None = None, ) -> None: """Create a new Google provider. 
Args: api_key: The `API key `_ to use for authentication. It can also be set via the `GOOGLE_API_KEY` environment variable. Applies to the Gemini Developer API only. credentials: The credentials to use for authentication when calling the Vertex AI APIs. Credentials can be obtained from environment variables and default credentials. For more information, see Set up Application Default Credentials. Applies to the Vertex AI API only. project: The Google Cloud project ID to use for quota. Can be obtained from environment variables (for example, GOOGLE_CLOUD_PROJECT). Applies to the Vertex AI API only. location: The location to send API requests to (for example, us-central1). Can be obtained from environment variables. Applies to the Vertex AI API only. client: A pre-initialized client to use. vertexai: Force the use of the Vertex AI API. If `False`, the Google Generative Language API will be used. Defaults to `False`. """ if client is None: # NOTE: We are keeping GEMINI_API_KEY for backwards compatibility. api_key = api_key or os.getenv('GOOGLE_API_KEY') or os.getenv('GEMINI_API_KEY') if vertexai is None: vertexai = bool(location or project or credentials) http_options: HttpOptionsDict = { 'headers': {'User-Agent': get_user_agent()}, 'async_client_args': {'transport': httpx.AsyncHTTPTransport()}, } if not vertexai: if api_key is None: raise UserError( # pragma: no cover 'Set the `GOOGLE_API_KEY` environment variable or pass it via `GoogleProvider(api_key=...)`' 'to use the Google Generative Language API.' ) self._client = Client(vertexai=vertexai, api_key=api_key, http_options=http_options) else: self._client = Client( vertexai=vertexai, project=project or os.getenv('GOOGLE_CLOUD_PROJECT'), # From https://github.com/pydantic/pydantic-ai/pull/2031/files#r2169682149: # Currently `us-central1` supports the most models by far of any region including `global`, but not # all of them. `us-central1` has all google models but is missing some Anthropic partner models, # which use `us-east5` instead. `global` has fewer models but higher availability. # For more details, check: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#available-regions location=location or os.getenv('GOOGLE_CLOUD_LOCATION') or 'us-central1', credentials=credentials, http_options=http_options, ) else: self._client = client ``` ### VertexAILocation ```python VertexAILocation = Literal[ "asia-east1", "asia-east2", "asia-northeast1", "asia-northeast3", "asia-south1", "asia-southeast1", "australia-southeast1", "europe-central2", "europe-north1", "europe-southwest1", "europe-west1", "europe-west2", "europe-west3", "europe-west4", "europe-west6", "europe-west8", "europe-west9", "me-central1", "me-central2", "me-west1", "northamerica-northeast1", "southamerica-east1", "us-central1", "us-east1", "us-east4", "us-east5", "us-south1", "us-west1", "us-west4", ] ``` Regions available for Vertex AI. More details [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#genai-locations). ### OpenAIProvider Bases: `Provider[AsyncOpenAI]` Provider for OpenAI API. 
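As a quick orientation before the source listing: a provider is normally only constructed explicitly when you need to override the base URL, API key, or HTTP client, and is then handed to a model class. The sketch below assumes the usual Pydantic AI pattern of passing the provider to `OpenAIChatModel` via its `provider` argument; the endpoint, key, and model name are placeholders, not defaults from this reference.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

# Placeholder values for an OpenAI-compatible endpoint; in practice the
# OPENAI_BASE_URL and OPENAI_API_KEY environment variables can be used instead.
provider = OpenAIProvider(base_url='http://localhost:8000/v1', api_key='placeholder-key')
model = OpenAIChatModel('my-local-model', provider=provider)
agent = Agent(model)
```

When no such customization is needed, constructing the provider explicitly is unnecessary.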
Source code in `pydantic_ai_slim/pydantic_ai/providers/openai.py` ```python class OpenAIProvider(Provider[AsyncOpenAI]): """Provider for OpenAI API.""" @property def name(self) -> str: return 'openai' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: return openai_model_profile(model_name) @overload def __init__(self, *, openai_client: AsyncOpenAI) -> None: ... @overload def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: None = None, http_client: httpx.AsyncClient | None = None, ) -> None: ... def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new OpenAI provider. Args: base_url: The base url for the OpenAI requests. If not provided, the `OPENAI_BASE_URL` environment variable will be used if available. Otherwise, defaults to OpenAI's base url. api_key: The API key to use for authentication, if not provided, the `OPENAI_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ # This is a workaround for the OpenAI client requiring an API key, whilst locally served, # openai compatible models do not always need an API key, but a placeholder (non-empty) key is required. if api_key is None and 'OPENAI_API_KEY' not in os.environ and base_url is not None and openai_client is None: api_key = 'api-key-not-set' if openai_client is not None: assert base_url is None, 'Cannot provide both `openai_client` and `base_url`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='openai') self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) ``` #### __init__ ```python __init__(*, openai_client: AsyncOpenAI) -> None ``` ```python __init__( base_url: str | None = None, api_key: str | None = None, openai_client: None = None, http_client: AsyncClient | None = None, ) -> None ``` ```python __init__( base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None, ) -> None ``` Create a new OpenAI provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `base_url` | `str | None` | The base url for the OpenAI requests. If not provided, the OPENAI_BASE_URL environment variable will be used if available. Otherwise, defaults to OpenAI's base url. | `None` | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the OPENAI_API_KEY environment variable will be used if available. | `None` | | `openai_client` | `AsyncOpenAI | None` | An existing AsyncOpenAI client to use. If provided, base_url, api_key, and http_client must be None. | `None` | | `http_client` | `AsyncClient | None` | An existing httpx.AsyncClient to use for making HTTP requests. 
| `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/openai.py` ```python def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new OpenAI provider. Args: base_url: The base url for the OpenAI requests. If not provided, the `OPENAI_BASE_URL` environment variable will be used if available. Otherwise, defaults to OpenAI's base url. api_key: The API key to use for authentication, if not provided, the `OPENAI_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ # This is a workaround for the OpenAI client requiring an API key, whilst locally served, # openai compatible models do not always need an API key, but a placeholder (non-empty) key is required. if api_key is None and 'OPENAI_API_KEY' not in os.environ and base_url is not None and openai_client is None: api_key = 'api-key-not-set' if openai_client is not None: assert base_url is None, 'Cannot provide both `openai_client` and `base_url`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='openai') self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) ``` ### DeepSeekProvider Bases: `Provider[AsyncOpenAI]` Provider for DeepSeek API. Source code in `pydantic_ai_slim/pydantic_ai/providers/deepseek.py` ```python class DeepSeekProvider(Provider[AsyncOpenAI]): """Provider for DeepSeek API.""" @property def name(self) -> str: return 'deepseek' @property def base_url(self) -> str: return 'https://api.deepseek.com' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: profile = deepseek_model_profile(model_name) # As DeepSeekProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly. # This was not the case when using a DeepSeek model with another model class (e.g. BedrockConverseModel or GroqModel), # so we won't do this in `deepseek_model_profile` unless we learn it's always needed. return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('DEEPSEEK_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `DEEPSEEK_API_KEY` environment variable or pass it via `DeepSeekProvider(api_key=...)`' 'to use the DeepSeek provider.' 
) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='deepseek') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` ### BedrockModelProfile Bases: `ModelProfile` Profile for models used with BedrockModel. ALL FIELDS MUST BE `bedrock_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python @dataclass(kw_only=True) class BedrockModelProfile(ModelProfile): """Profile for models used with BedrockModel. ALL FIELDS MUST BE `bedrock_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. """ bedrock_supports_tool_choice: bool = False bedrock_tool_result_format: Literal['text', 'json'] = 'text' bedrock_send_back_thinking_parts: bool = False ``` ### bedrock_amazon_model_profile ```python bedrock_amazon_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for an Amazon model used via Bedrock. Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python def bedrock_amazon_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for an Amazon model used via Bedrock.""" profile = amazon_model_profile(model_name) if 'nova' in model_name: return BedrockModelProfile(bedrock_supports_tool_choice=True).update(profile) return profile ``` ### bedrock_deepseek_model_profile ```python bedrock_deepseek_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a DeepSeek model used via Bedrock. Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python def bedrock_deepseek_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a DeepSeek model used via Bedrock.""" profile = deepseek_model_profile(model_name) if 'r1' in model_name: return BedrockModelProfile(bedrock_send_back_thinking_parts=True).update(profile) return profile # pragma: no cover ``` ### BedrockProvider Bases: `Provider[BaseClient]` Provider for AWS Bedrock. Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python class BedrockProvider(Provider[BaseClient]): """Provider for AWS Bedrock.""" @property def name(self) -> str: return 'bedrock' @property def base_url(self) -> str: return self._client.meta.endpoint_url @property def client(self) -> BaseClient: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile: dict[str, Callable[[str], ModelProfile | None]] = { 'anthropic': lambda model_name: BedrockModelProfile( bedrock_supports_tool_choice=True, bedrock_send_back_thinking_parts=True ).update(anthropic_model_profile(model_name)), 'mistral': lambda model_name: BedrockModelProfile(bedrock_tool_result_format='json').update( mistral_model_profile(model_name) ), 'cohere': cohere_model_profile, 'amazon': bedrock_amazon_model_profile, 'meta': meta_model_profile, 'deepseek': bedrock_deepseek_model_profile, } # Split the model name into parts parts = model_name.split('.', 2) # Handle regional prefixes (e.g. "us.") if len(parts) > 2 and len(parts[0]) == 2: parts = parts[1:] if len(parts) < 2: return None provider = parts[0] model_name_with_version = parts[1] # Remove version suffix if it matches the format (e.g. 
"-v1:0" or "-v14") version_match = re.match(r'(.+)-v\d+(?::\d+)?$', model_name_with_version) if version_match: model_name = version_match.group(1) else: model_name = model_name_with_version if provider in provider_to_profile: return provider_to_profile[provider](model_name) return None @overload def __init__(self, *, bedrock_client: BaseClient) -> None: ... @overload def __init__( self, *, region_name: str | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None, ) -> None: ... def __init__( self, *, bedrock_client: BaseClient | None = None, region_name: str | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None, ) -> None: """Initialize the Bedrock provider. Args: bedrock_client: A boto3 client for Bedrock Runtime. If provided, other arguments are ignored. region_name: The AWS region name. aws_access_key_id: The AWS access key ID. aws_secret_access_key: The AWS secret access key. aws_session_token: The AWS session token. profile_name: The AWS profile name. aws_read_timeout: The read timeout for Bedrock client. aws_connect_timeout: The connect timeout for Bedrock client. """ if bedrock_client is not None: self._client = bedrock_client else: try: read_timeout = aws_read_timeout or float(os.getenv('AWS_READ_TIMEOUT', 300)) connect_timeout = aws_connect_timeout or float(os.getenv('AWS_CONNECT_TIMEOUT', 60)) session = boto3.Session( aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key, aws_session_token=aws_session_token, region_name=region_name, profile_name=profile_name, ) self._client = session.client( # type: ignore[reportUnknownMemberType] 'bedrock-runtime', config=Config(read_timeout=read_timeout, connect_timeout=connect_timeout), ) except NoRegionError as exc: # pragma: no cover raise UserError('You must provide a `region_name` or a boto3 client for Bedrock Runtime.') from exc ``` #### __init__ ```python __init__(*, bedrock_client: BaseClient) -> None ``` ```python __init__( *, region_name: str | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None ) -> None ``` ```python __init__( *, bedrock_client: BaseClient | None = None, region_name: str | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None ) -> None ``` Initialize the Bedrock provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `bedrock_client` | `BaseClient | None` | A boto3 client for Bedrock Runtime. If provided, other arguments are ignored. | `None` | | `region_name` | `str | None` | The AWS region name. | `None` | | `aws_access_key_id` | `str | None` | The AWS access key ID. | `None` | | `aws_secret_access_key` | `str | None` | The AWS secret access key. | `None` | | `aws_session_token` | `str | None` | The AWS session token. | `None` | | `profile_name` | `str | None` | The AWS profile name. 
| `None` | | `aws_read_timeout` | `float | None` | The read timeout for Bedrock client. | `None` | | `aws_connect_timeout` | `float | None` | The connect timeout for Bedrock client. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python def __init__( self, *, bedrock_client: BaseClient | None = None, region_name: str | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None, ) -> None: """Initialize the Bedrock provider. Args: bedrock_client: A boto3 client for Bedrock Runtime. If provided, other arguments are ignored. region_name: The AWS region name. aws_access_key_id: The AWS access key ID. aws_secret_access_key: The AWS secret access key. aws_session_token: The AWS session token. profile_name: The AWS profile name. aws_read_timeout: The read timeout for Bedrock client. aws_connect_timeout: The connect timeout for Bedrock client. """ if bedrock_client is not None: self._client = bedrock_client else: try: read_timeout = aws_read_timeout or float(os.getenv('AWS_READ_TIMEOUT', 300)) connect_timeout = aws_connect_timeout or float(os.getenv('AWS_CONNECT_TIMEOUT', 60)) session = boto3.Session( aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key, aws_session_token=aws_session_token, region_name=region_name, profile_name=profile_name, ) self._client = session.client( # type: ignore[reportUnknownMemberType] 'bedrock-runtime', config=Config(read_timeout=read_timeout, connect_timeout=connect_timeout), ) except NoRegionError as exc: # pragma: no cover raise UserError('You must provide a `region_name` or a boto3 client for Bedrock Runtime.') from exc ``` ### groq_moonshotai_model_profile ```python groq_moonshotai_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for an MoonshotAI model used with the Groq provider. Source code in `pydantic_ai_slim/pydantic_ai/providers/groq.py` ```python def groq_moonshotai_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for an MoonshotAI model used with the Groq provider.""" return ModelProfile(supports_json_object_output=True, supports_json_schema_output=True).update( moonshotai_model_profile(model_name) ) ``` ### meta_groq_model_profile ```python meta_groq_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a Meta model used with the Groq provider. Source code in `pydantic_ai_slim/pydantic_ai/providers/groq.py` ```python def meta_groq_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Meta model used with the Groq provider.""" if model_name in {'llama-4-maverick-17b-128e-instruct', 'llama-4-scout-17b-16e-instruct'}: return ModelProfile(supports_json_object_output=True, supports_json_schema_output=True).update( meta_model_profile(model_name) ) else: return meta_model_profile(model_name) ``` ### GroqProvider Bases: `Provider[AsyncGroq]` Provider for Groq API. 
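A minimal usage sketch, assuming `GroqModel` is the model class paired with this provider and that it accepts a `provider` argument; the API key and model name below are placeholders, and in practice the key would usually come from the `GROQ_API_KEY` environment variable described in the source below.

```python
from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel
from pydantic_ai.providers.groq import GroqProvider

# Placeholder key and model name, shown only to illustrate the wiring.
provider = GroqProvider(api_key='placeholder-groq-key')
model = GroqModel('llama-3.3-70b-versatile', provider=provider)
agent = Agent(model)
```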
Source code in `pydantic_ai_slim/pydantic_ai/providers/groq.py` ```python class GroqProvider(Provider[AsyncGroq]): """Provider for Groq API.""" @property def name(self) -> str: return 'groq' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncGroq: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: prefix_to_profile = { 'llama': meta_model_profile, 'meta-llama/': meta_groq_model_profile, 'gemma': google_model_profile, 'qwen': qwen_model_profile, 'deepseek': deepseek_model_profile, 'mistral': mistral_model_profile, 'moonshotai/': groq_moonshotai_model_profile, 'compound-': groq_model_profile, 'openai/': openai_model_profile, } for prefix, profile_func in prefix_to_profile.items(): model_name = model_name.lower() if model_name.startswith(prefix): if prefix.endswith('/'): model_name = model_name[len(prefix) :] return profile_func(model_name) return None @overload def __init__(self, *, groq_client: AsyncGroq | None = None) -> None: ... @overload def __init__( self, *, api_key: str | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None ) -> None: ... def __init__( self, *, api_key: str | None = None, base_url: str | None = None, groq_client: AsyncGroq | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Groq provider. Args: api_key: The API key to use for authentication, if not provided, the `GROQ_API_KEY` environment variable will be used if available. base_url: The base url for the Groq requests. If not provided, the `GROQ_BASE_URL` environment variable will be used if available. Otherwise, defaults to Groq's base url. groq_client: An existing [`AsyncGroq`](https://github.com/groq/groq-python?tab=readme-ov-file#async-usage) client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `AsyncHTTPClient` to use for making HTTP requests. """ if groq_client is not None: assert http_client is None, 'Cannot provide both `groq_client` and `http_client`' assert api_key is None, 'Cannot provide both `groq_client` and `api_key`' assert base_url is None, 'Cannot provide both `groq_client` and `base_url`' self._client = groq_client else: api_key = api_key or os.getenv('GROQ_API_KEY') base_url = base_url or os.getenv('GROQ_BASE_URL', 'https://api.groq.com') if not api_key: raise UserError( 'Set the `GROQ_API_KEY` environment variable or pass it via `GroqProvider(api_key=...)`' 'to use the Groq provider.' ) elif http_client is not None: self._client = AsyncGroq(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='groq') self._client = AsyncGroq(base_url=base_url, api_key=api_key, http_client=http_client) ``` #### __init__ ```python __init__(*, groq_client: AsyncGroq | None = None) -> None ``` ```python __init__( *, api_key: str | None = None, base_url: str | None = None, http_client: AsyncClient | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, base_url: str | None = None, groq_client: AsyncGroq | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Groq provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the GROQ_API_KEY environment variable will be used if available. | `None` | | `base_url` | `str | None` | The base url for the Groq requests. 
If not provided, the GROQ_BASE_URL environment variable will be used if available. Otherwise, defaults to Groq's base url. | `None` | | `groq_client` | `AsyncGroq | None` | An existing AsyncGroq client to use. If provided, api_key and http_client must be None. | `None` | | `http_client` | `AsyncClient | None` | An existing AsyncHTTPClient to use for making HTTP requests. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/groq.py` ```python def __init__( self, *, api_key: str | None = None, base_url: str | None = None, groq_client: AsyncGroq | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Groq provider. Args: api_key: The API key to use for authentication, if not provided, the `GROQ_API_KEY` environment variable will be used if available. base_url: The base url for the Groq requests. If not provided, the `GROQ_BASE_URL` environment variable will be used if available. Otherwise, defaults to Groq's base url. groq_client: An existing [`AsyncGroq`](https://github.com/groq/groq-python?tab=readme-ov-file#async-usage) client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `AsyncHTTPClient` to use for making HTTP requests. """ if groq_client is not None: assert http_client is None, 'Cannot provide both `groq_client` and `http_client`' assert api_key is None, 'Cannot provide both `groq_client` and `api_key`' assert base_url is None, 'Cannot provide both `groq_client` and `base_url`' self._client = groq_client else: api_key = api_key or os.getenv('GROQ_API_KEY') base_url = base_url or os.getenv('GROQ_BASE_URL', 'https://api.groq.com') if not api_key: raise UserError( 'Set the `GROQ_API_KEY` environment variable or pass it via `GroqProvider(api_key=...)`' 'to use the Groq provider.' ) elif http_client is not None: self._client = AsyncGroq(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='groq') self._client = AsyncGroq(base_url=base_url, api_key=api_key, http_client=http_client) ``` ### AzureProvider Bases: `Provider[AsyncOpenAI]` Provider for Azure OpenAI API. See for more information. Source code in `pydantic_ai_slim/pydantic_ai/providers/azure.py` ```python class AzureProvider(Provider[AsyncOpenAI]): """Provider for Azure OpenAI API. See for more information. """ @property def name(self) -> str: return 'azure' @property def base_url(self) -> str: assert self._base_url is not None return self._base_url @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: model_name = model_name.lower() prefix_to_profile = { 'llama': meta_model_profile, 'meta-': meta_model_profile, 'deepseek': deepseek_model_profile, 'mistralai-': mistral_model_profile, 'mistral': mistral_model_profile, 'cohere-': cohere_model_profile, 'grok': grok_model_profile, } for prefix, profile_func in prefix_to_profile.items(): if model_name.startswith(prefix): if prefix.endswith('-'): model_name = model_name[len(prefix) :] profile = profile_func(model_name) # As AzureProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) # OpenAI models are unprefixed return openai_model_profile(model_name) @overload def __init__(self, *, openai_client: AsyncAzureOpenAI) -> None: ... 
@overload def __init__( self, *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: ... def __init__( self, *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, openai_client: AsyncAzureOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Azure provider. Args: azure_endpoint: The Azure endpoint to use for authentication, if not provided, the `AZURE_OPENAI_ENDPOINT` environment variable will be used if available. api_version: The API version to use for authentication, if not provided, the `OPENAI_API_VERSION` environment variable will be used if available. api_key: The API key to use for authentication, if not provided, the `AZURE_OPENAI_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncAzureOpenAI`](https://github.com/openai/openai-python#microsoft-azure-openai) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if openai_client is not None: assert azure_endpoint is None, 'Cannot provide both `openai_client` and `azure_endpoint`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._base_url = str(openai_client.base_url) self._client = openai_client else: azure_endpoint = azure_endpoint or os.getenv('AZURE_OPENAI_ENDPOINT') if not azure_endpoint: raise UserError( 'Must provide one of the `azure_endpoint` argument or the `AZURE_OPENAI_ENDPOINT` environment variable' ) if not api_key and 'AZURE_OPENAI_API_KEY' not in os.environ: # pragma: no cover raise UserError( 'Must provide one of the `api_key` argument or the `AZURE_OPENAI_API_KEY` environment variable' ) if not api_version and 'OPENAI_API_VERSION' not in os.environ: # pragma: no cover raise UserError( 'Must provide one of the `api_version` argument or the `OPENAI_API_VERSION` environment variable' ) http_client = http_client or cached_async_http_client(provider='azure') self._client = AsyncAzureOpenAI( azure_endpoint=azure_endpoint, api_key=api_key, api_version=api_version, http_client=http_client, ) self._base_url = str(self._client.base_url) ``` #### __init__ ```python __init__(*, openai_client: AsyncAzureOpenAI) -> None ``` ```python __init__( *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, http_client: AsyncClient | None = None ) -> None ``` ```python __init__( *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, openai_client: AsyncAzureOpenAI | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Azure provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `azure_endpoint` | `str | None` | The Azure endpoint to use for authentication, if not provided, the AZURE_OPENAI_ENDPOINT environment variable will be used if available. | `None` | | `api_version` | `str | None` | The API version to use for authentication, if not provided, the OPENAI_API_VERSION environment variable will be used if available. | `None` | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the AZURE_OPENAI_API_KEY environment variable will be used if available. 
| `None` | | `openai_client` | `AsyncAzureOpenAI | None` | An existing AsyncAzureOpenAI client to use. If provided, base_url, api_key, and http_client must be None. | `None` | | `http_client` | `AsyncClient | None` | An existing httpx.AsyncClient to use for making HTTP requests. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/azure.py` ```python def __init__( self, *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, openai_client: AsyncAzureOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Azure provider. Args: azure_endpoint: The Azure endpoint to use for authentication, if not provided, the `AZURE_OPENAI_ENDPOINT` environment variable will be used if available. api_version: The API version to use for authentication, if not provided, the `OPENAI_API_VERSION` environment variable will be used if available. api_key: The API key to use for authentication, if not provided, the `AZURE_OPENAI_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncAzureOpenAI`](https://github.com/openai/openai-python#microsoft-azure-openai) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if openai_client is not None: assert azure_endpoint is None, 'Cannot provide both `openai_client` and `azure_endpoint`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._base_url = str(openai_client.base_url) self._client = openai_client else: azure_endpoint = azure_endpoint or os.getenv('AZURE_OPENAI_ENDPOINT') if not azure_endpoint: raise UserError( 'Must provide one of the `azure_endpoint` argument or the `AZURE_OPENAI_ENDPOINT` environment variable' ) if not api_key and 'AZURE_OPENAI_API_KEY' not in os.environ: # pragma: no cover raise UserError( 'Must provide one of the `api_key` argument or the `AZURE_OPENAI_API_KEY` environment variable' ) if not api_version and 'OPENAI_API_VERSION' not in os.environ: # pragma: no cover raise UserError( 'Must provide one of the `api_version` argument or the `OPENAI_API_VERSION` environment variable' ) http_client = http_client or cached_async_http_client(provider='azure') self._client = AsyncAzureOpenAI( azure_endpoint=azure_endpoint, api_key=api_key, api_version=api_version, http_client=http_client, ) self._base_url = str(self._client.base_url) ``` ### CohereProvider Bases: `Provider[AsyncClientV2]` Provider for Cohere API. Source code in `pydantic_ai_slim/pydantic_ai/providers/cohere.py` ```python class CohereProvider(Provider[AsyncClientV2]): """Provider for Cohere API.""" @property def name(self) -> str: return 'cohere' @property def base_url(self) -> str: client_wrapper = self.client._client_wrapper # type: ignore return str(client_wrapper.get_base_url()) @property def client(self) -> AsyncClientV2: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: return cohere_model_profile(model_name) def __init__( self, *, api_key: str | None = None, cohere_client: AsyncClientV2 | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Cohere provider. Args: api_key: The API key to use for authentication, if not provided, the `CO_API_KEY` environment variable will be used if available. 
cohere_client: An existing [AsyncClientV2](https://github.com/cohere-ai/cohere-python) client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if cohere_client is not None: assert http_client is None, 'Cannot provide both `cohere_client` and `http_client`' assert api_key is None, 'Cannot provide both `cohere_client` and `api_key`' self._client = cohere_client else: api_key = api_key or os.getenv('CO_API_KEY') if not api_key: raise UserError( 'Set the `CO_API_KEY` environment variable or pass it via `CohereProvider(api_key=...)`' 'to use the Cohere provider.' ) base_url = os.getenv('CO_BASE_URL') if http_client is not None: self._client = AsyncClientV2(api_key=api_key, httpx_client=http_client, base_url=base_url) else: http_client = cached_async_http_client(provider='cohere') self._client = AsyncClientV2(api_key=api_key, httpx_client=http_client, base_url=base_url) ``` #### __init__ ```python __init__( *, api_key: str | None = None, cohere_client: AsyncClientV2 | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Cohere provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the CO_API_KEY environment variable will be used if available. | `None` | | `cohere_client` | `AsyncClientV2 | None` | An existing AsyncClientV2 client to use. If provided, api_key and http_client must be None. | `None` | | `http_client` | `AsyncClient | None` | An existing httpx.AsyncClient to use for making HTTP requests. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/cohere.py` ```python def __init__( self, *, api_key: str | None = None, cohere_client: AsyncClientV2 | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Cohere provider. Args: api_key: The API key to use for authentication, if not provided, the `CO_API_KEY` environment variable will be used if available. cohere_client: An existing [AsyncClientV2](https://github.com/cohere-ai/cohere-python) client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if cohere_client is not None: assert http_client is None, 'Cannot provide both `cohere_client` and `http_client`' assert api_key is None, 'Cannot provide both `cohere_client` and `api_key`' self._client = cohere_client else: api_key = api_key or os.getenv('CO_API_KEY') if not api_key: raise UserError( 'Set the `CO_API_KEY` environment variable or pass it via `CohereProvider(api_key=...)`' 'to use the Cohere provider.' ) base_url = os.getenv('CO_BASE_URL') if http_client is not None: self._client = AsyncClientV2(api_key=api_key, httpx_client=http_client, base_url=base_url) else: http_client = cached_async_http_client(provider='cohere') self._client = AsyncClientV2(api_key=api_key, httpx_client=http_client, base_url=base_url) ``` Bases: `Provider[AsyncOpenAI]` Provider for Cerebras API. 
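Since this provider wraps an `AsyncOpenAI` client, the sketch below assumes it is used with `OpenAIChatModel` via a `provider` argument; the API key and model name are placeholders, and by default the key is read from the `CEREBRAS_API_KEY` environment variable shown in the source below.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.cerebras import CerebrasProvider

# Placeholder key and model name, shown only to illustrate the wiring.
provider = CerebrasProvider(api_key='placeholder-cerebras-key')
model = OpenAIChatModel('llama3.1-8b', provider=provider)
agent = Agent(model)
```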
Source code in `pydantic_ai_slim/pydantic_ai/providers/cerebras.py` ```python class CerebrasProvider(Provider[AsyncOpenAI]): """Provider for Cerebras API.""" @property def name(self) -> str: return 'cerebras' @property def base_url(self) -> str: return 'https://api.cerebras.ai/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: prefix_to_profile = {'llama': meta_model_profile, 'qwen': qwen_model_profile, 'gpt-oss': harmony_model_profile} profile = None for prefix, profile_func in prefix_to_profile.items(): model_name = model_name.lower() if model_name.startswith(prefix): profile = profile_func(model_name) # According to https://inference-docs.cerebras.ai/resources/openai#currently-unsupported-openai-features, # Cerebras doesn't support some model settings. unsupported_model_settings = ( 'frequency_penalty', 'logit_bias', 'presence_penalty', 'parallel_tool_calls', 'service_tier', ) return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, openai_unsupported_model_settings=unsupported_model_settings, ).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('CEREBRAS_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `CEREBRAS_API_KEY` environment variable or pass it via `CerebrasProvider(api_key=...)` ' 'to use the Cerebras provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='cerebras') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[Mistral]` Provider for Mistral API. Source code in `pydantic_ai_slim/pydantic_ai/providers/mistral.py` ```python class MistralProvider(Provider[Mistral]): """Provider for Mistral API.""" @property def name(self) -> str: return 'mistral' @property def base_url(self) -> str: return self.client.sdk_configuration.get_server_details()[0] @property def client(self) -> Mistral: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: return mistral_model_profile(model_name) @overload def __init__(self, *, mistral_client: Mistral | None = None) -> None: ... @overload def __init__(self, *, api_key: str | None = None, http_client: httpx.AsyncClient | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, mistral_client: Mistral | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Mistral provider. Args: api_key: The API key to use for authentication, if not provided, the `MISTRAL_API_KEY` environment variable will be used if available. mistral_client: An existing `Mistral` client to use, if provided, `api_key` and `http_client` must be `None`. base_url: The base url for the Mistral requests. http_client: An existing async client to use for making HTTP requests. 
""" if mistral_client is not None: assert http_client is None, 'Cannot provide both `mistral_client` and `http_client`' assert api_key is None, 'Cannot provide both `mistral_client` and `api_key`' assert base_url is None, 'Cannot provide both `mistral_client` and `base_url`' self._client = mistral_client else: api_key = api_key or os.getenv('MISTRAL_API_KEY') if not api_key: raise UserError( 'Set the `MISTRAL_API_KEY` environment variable or pass it via `MistralProvider(api_key=...)`' 'to use the Mistral provider.' ) elif http_client is not None: self._client = Mistral(api_key=api_key, async_client=http_client, server_url=base_url) else: http_client = cached_async_http_client(provider='mistral') self._client = Mistral(api_key=api_key, async_client=http_client, server_url=base_url) ``` ### __init__ ```python __init__(*, mistral_client: Mistral | None = None) -> None ``` ```python __init__( *, api_key: str | None = None, http_client: AsyncClient | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, mistral_client: Mistral | None = None, base_url: str | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Mistral provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the MISTRAL_API_KEY environment variable will be used if available. | `None` | | `mistral_client` | `Mistral | None` | An existing Mistral client to use, if provided, api_key and http_client must be None. | `None` | | `base_url` | `str | None` | The base url for the Mistral requests. | `None` | | `http_client` | `AsyncClient | None` | An existing async client to use for making HTTP requests. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/mistral.py` ```python def __init__( self, *, api_key: str | None = None, mistral_client: Mistral | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Mistral provider. Args: api_key: The API key to use for authentication, if not provided, the `MISTRAL_API_KEY` environment variable will be used if available. mistral_client: An existing `Mistral` client to use, if provided, `api_key` and `http_client` must be `None`. base_url: The base url for the Mistral requests. http_client: An existing async client to use for making HTTP requests. """ if mistral_client is not None: assert http_client is None, 'Cannot provide both `mistral_client` and `http_client`' assert api_key is None, 'Cannot provide both `mistral_client` and `api_key`' assert base_url is None, 'Cannot provide both `mistral_client` and `base_url`' self._client = mistral_client else: api_key = api_key or os.getenv('MISTRAL_API_KEY') if not api_key: raise UserError( 'Set the `MISTRAL_API_KEY` environment variable or pass it via `MistralProvider(api_key=...)`' 'to use the Mistral provider.' ) elif http_client is not None: self._client = Mistral(api_key=api_key, async_client=http_client, server_url=base_url) else: http_client = cached_async_http_client(provider='mistral') self._client = Mistral(api_key=api_key, async_client=http_client, server_url=base_url) ``` Bases: `Provider[AsyncOpenAI]` Provider for Fireworks AI API. 
Source code in `pydantic_ai_slim/pydantic_ai/providers/fireworks.py` ```python class FireworksProvider(Provider[AsyncOpenAI]): """Provider for Fireworks AI API.""" @property def name(self) -> str: return 'fireworks' @property def base_url(self) -> str: return 'https://api.fireworks.ai/inference/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: prefix_to_profile = { 'llama': meta_model_profile, 'qwen': qwen_model_profile, 'deepseek': deepseek_model_profile, 'mistral': mistral_model_profile, 'gemma': google_model_profile, } prefix = 'accounts/fireworks/models/' profile = None if model_name.startswith(prefix): model_name = model_name[len(prefix) :] for provider, profile_func in prefix_to_profile.items(): if model_name.startswith(provider): profile = profile_func(model_name) break # As the Fireworks API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer, # unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('FIREWORKS_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `FIREWORKS_API_KEY` environment variable or pass it via `FireworksProvider(api_key=...)`' 'to use the Fireworks AI provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='fireworks') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for Grok API. Source code in `pydantic_ai_slim/pydantic_ai/providers/grok.py` ```python class GrokProvider(Provider[AsyncOpenAI]): """Provider for Grok API.""" @property def name(self) -> str: return 'grok' @property def base_url(self) -> str: return 'https://api.x.ai/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: profile = grok_model_profile(model_name) # As the Grok API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer, # unless json_schema_transformer is set explicitly. # Also, Grok does not support strict tool definitions: https://github.com/pydantic/pydantic-ai/issues/1846 return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, openai_supports_strict_tool_definition=False ).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... 
def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('GROK_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `GROK_API_KEY` environment variable or pass it via `GrokProvider(api_key=...)`' 'to use the Grok provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='grok') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for Together AI API. Source code in `pydantic_ai_slim/pydantic_ai/providers/together.py` ```python class TogetherProvider(Provider[AsyncOpenAI]): """Provider for Together AI API.""" @property def name(self) -> str: return 'together' @property def base_url(self) -> str: return 'https://api.together.xyz/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'deepseek-ai': deepseek_model_profile, 'google': google_model_profile, 'qwen': qwen_model_profile, 'meta-llama': meta_model_profile, 'mistralai': mistral_model_profile, } profile = None model_name = model_name.lower() provider, model_name = model_name.split('/', 1) if provider in provider_to_profile: profile = provider_to_profile[provider](model_name) # As the Together API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer, # unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('TOGETHER_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `TOGETHER_API_KEY` environment variable or pass it via `TogetherProvider(api_key=...)`' 'to use the Together AI provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='together') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for Heroku API. Source code in `pydantic_ai_slim/pydantic_ai/providers/heroku.py` ```python class HerokuProvider(Provider[AsyncOpenAI]): """Provider for Heroku API.""" @property def name(self) -> str: return 'heroku' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: # As the Heroku API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer. return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer) @overload def __init__(self) -> None: ... 
@overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: if openai_client is not None: assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client else: api_key = api_key or os.getenv('HEROKU_INFERENCE_KEY') if not api_key: raise UserError( 'Set the `HEROKU_INFERENCE_KEY` environment variable or pass it via `HerokuProvider(api_key=...)`' 'to use the Heroku provider.' ) base_url = base_url or os.getenv('HEROKU_INFERENCE_URL', 'https://us.inference.heroku.com') base_url = base_url.rstrip('/') + '/v1' if http_client is not None: self._client = AsyncOpenAI(api_key=api_key, http_client=http_client, base_url=base_url) else: http_client = cached_async_http_client(provider='heroku') self._client = AsyncOpenAI(api_key=api_key, http_client=http_client, base_url=base_url) ``` Bases: `Provider[AsyncOpenAI]` Provider for GitHub Models API. GitHub Models provides access to various AI models through an OpenAI-compatible API. See for more information. Source code in `pydantic_ai_slim/pydantic_ai/providers/github.py` ```python class GitHubProvider(Provider[AsyncOpenAI]): """Provider for GitHub Models API. GitHub Models provides access to various AI models through an OpenAI-compatible API. See for more information. """ @property def name(self) -> str: return 'github' @property def base_url(self) -> str: return 'https://models.github.ai/inference' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'xai': grok_model_profile, 'meta': meta_model_profile, 'microsoft': openai_model_profile, 'mistral-ai': mistral_model_profile, 'cohere': cohere_model_profile, 'deepseek': deepseek_model_profile, } profile = None # If the model name does not contain a provider prefix, we assume it's an OpenAI model if '/' not in model_name: return openai_model_profile(model_name) provider, model_name = model_name.lower().split('/', 1) if provider in provider_to_profile: model_name, *_ = model_name.split(':', 1) # drop tags profile = provider_to_profile[provider](model_name) # As GitHubProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new GitHub Models provider. Args: api_key: The GitHub token to use for authentication. If not provided, the `GITHUB_API_KEY` environment variable will be used if available. openai_client: An existing `AsyncOpenAI` client to use. 
If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ api_key = api_key or os.getenv('GITHUB_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `GITHUB_API_KEY` environment variable or pass it via `GitHubProvider(api_key=...)`' ' to use the GitHub Models provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='github') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` ### __init__ ```python __init__() -> None ``` ```python __init__(*, api_key: str) -> None ``` ```python __init__(*, api_key: str, http_client: AsyncClient) -> None ``` ```python __init__( *, openai_client: AsyncOpenAI | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new GitHub Models provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str | None` | The GitHub token to use for authentication. If not provided, the GITHUB_API_KEY environment variable will be used if available. | `None` | | `openai_client` | `AsyncOpenAI | None` | An existing AsyncOpenAI client to use. If provided, api_key and http_client must be None. | `None` | | `http_client` | `AsyncClient | None` | An existing httpx.AsyncClient to use for making HTTP requests. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/github.py` ```python def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new GitHub Models provider. Args: api_key: The GitHub token to use for authentication. If not provided, the `GITHUB_API_KEY` environment variable will be used if available. openai_client: An existing `AsyncOpenAI` client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ api_key = api_key or os.getenv('GITHUB_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `GITHUB_API_KEY` environment variable or pass it via `GitHubProvider(api_key=...)`' ' to use the GitHub Models provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='github') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for OpenRouter API. 
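As a quick orientation before the source listing, here's a minimal, hedged sketch of constructing the provider directly. The API key and model name are placeholders, and the last line assumes `OpenAIChatModel` (from `pydantic_ai.models.openai`) accepts a `provider` argument, as the comments in the source below imply; treat it as illustrative rather than canonical.

```python
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openrouter import OpenRouterProvider

# With no arguments, the provider falls back to the OPENROUTER_API_KEY
# environment variable and raises UserError if no key is available at all.
provider = OpenRouterProvider(api_key='sk-or-placeholder')

print(provider.base_url)
#> https://openrouter.ai/api/v1

# Profiles are resolved from the 'provider/model' prefix; ':tags' are dropped.
profile = provider.model_profile('anthropic/claude-3.5-sonnet:beta')

# The provider is intended to be plugged into OpenAIChatModel (see the source comments below).
model = OpenAIChatModel('anthropic/claude-3.5-sonnet', provider=provider)
```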
Source code in `pydantic_ai_slim/pydantic_ai/providers/openrouter.py` ```python class OpenRouterProvider(Provider[AsyncOpenAI]): """Provider for OpenRouter API.""" @property def name(self) -> str: return 'openrouter' @property def base_url(self) -> str: return 'https://openrouter.ai/api/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'google': google_model_profile, 'openai': openai_model_profile, 'anthropic': anthropic_model_profile, 'mistralai': mistral_model_profile, 'qwen': qwen_model_profile, 'x-ai': grok_model_profile, 'cohere': cohere_model_profile, 'amazon': amazon_model_profile, 'deepseek': deepseek_model_profile, 'meta-llama': meta_model_profile, 'moonshotai': moonshotai_model_profile, } profile = None provider, model_name = model_name.split('/', 1) if provider in provider_to_profile: model_name, *_ = model_name.split(':', 1) # drop tags profile = provider_to_profile[provider](model_name) # As OpenRouterProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('OPENROUTER_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `OPENROUTER_API_KEY` environment variable or pass it via `OpenRouterProvider(api_key=...)`' 'to use the OpenRouter provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='openrouter') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for Vercel AI Gateway API. 
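A minimal sketch of direct construction; the key below is a placeholder. Note that `model_profile()` expects the `'provider/model'` naming enforced in the source listing that follows.

```python
from pydantic_ai.providers.vercel import VercelProvider

# Falls back to VERCEL_AI_GATEWAY_API_KEY or VERCEL_OIDC_TOKEN when api_key is omitted.
provider = VercelProvider(api_key='vercel-placeholder-key')

print(provider.base_url)
#> https://ai-gateway.vercel.sh/v1

# Model names must be in 'provider/model' form; anything else raises UserError.
profile = provider.model_profile('anthropic/claude-3-5-sonnet-20240620')
```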
Source code in `pydantic_ai_slim/pydantic_ai/providers/vercel.py` ```python class VercelProvider(Provider[AsyncOpenAI]): """Provider for Vercel AI Gateway API.""" @property def name(self) -> str: return 'vercel' @property def base_url(self) -> str: return 'https://ai-gateway.vercel.sh/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'anthropic': anthropic_model_profile, 'bedrock': amazon_model_profile, 'cohere': cohere_model_profile, 'deepseek': deepseek_model_profile, 'mistral': mistral_model_profile, 'openai': openai_model_profile, 'vertex': google_model_profile, 'xai': grok_model_profile, } profile = None try: provider, model_name = model_name.split('/', 1) except ValueError: raise UserError(f"Model name must be in 'provider/model' format, got: {model_name!r}") if provider in provider_to_profile: profile = provider_to_profile[provider](model_name) # As VercelProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, ).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: # Support Vercel AI Gateway's standard environment variables api_key = api_key or os.getenv('VERCEL_AI_GATEWAY_API_KEY') or os.getenv('VERCEL_OIDC_TOKEN') if not api_key and openai_client is None: raise UserError( 'Set the `VERCEL_AI_GATEWAY_API_KEY` or `VERCEL_OIDC_TOKEN` environment variable ' 'or pass the API key via `VercelProvider(api_key=...)` to use the Vercel provider.' ) default_headers = {'http-referer': 'https://ai.pydantic.dev/', 'x-title': 'pydantic-ai'} if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=default_headers ) else: http_client = cached_async_http_client(provider='vercel') self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=default_headers ) ``` Bases: `Provider[AsyncInferenceClient]` Provider for Hugging Face. 
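Before the source listing, a minimal sketch of two construction styles the constructor supports. The token and endpoint URL are placeholders, and `base_url` and `provider_name` are mutually exclusive, as enforced below.

```python
from pydantic_ai.providers.huggingface import HuggingFaceProvider

# A token must be supplied here or via the HF_TOKEN environment variable.
provider = HuggingFaceProvider(api_key='hf_placeholder_token')

# Alternatively, target a dedicated endpoint; `base_url` cannot be combined
# with `provider_name` (a ValueError is raised if both are given).
endpoint_provider = HuggingFaceProvider(
    api_key='hf_placeholder_token',
    base_url='https://my-endpoint.example.com',  # hypothetical endpoint URL
)
```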
Source code in `pydantic_ai_slim/pydantic_ai/providers/huggingface.py` ```python class HuggingFaceProvider(Provider[AsyncInferenceClient]): """Provider for Hugging Face.""" @property def name(self) -> str: return 'huggingface' @property def base_url(self) -> str: return self.client.model # type: ignore @property def client(self) -> AsyncInferenceClient: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'deepseek-ai': deepseek_model_profile, 'google': google_model_profile, 'qwen': qwen_model_profile, 'meta-llama': meta_model_profile, 'mistralai': mistral_model_profile, 'moonshotai': moonshotai_model_profile, } if '/' not in model_name: return None model_name = model_name.lower() provider, model_name = model_name.split('/', 1) if provider in provider_to_profile: return provider_to_profile[provider](model_name) return None @overload def __init__(self, *, base_url: str, api_key: str | None = None) -> None: ... @overload def __init__(self, *, provider_name: str, api_key: str | None = None) -> None: ... @overload def __init__(self, *, hf_client: AsyncInferenceClient, api_key: str | None = None) -> None: ... @overload def __init__(self, *, hf_client: AsyncInferenceClient, base_url: str, api_key: str | None = None) -> None: ... @overload def __init__(self, *, hf_client: AsyncInferenceClient, provider_name: str, api_key: str | None = None) -> None: ... @overload def __init__(self, *, api_key: str | None = None) -> None: ... def __init__( self, base_url: str | None = None, api_key: str | None = None, hf_client: AsyncInferenceClient | None = None, http_client: AsyncClient | None = None, provider_name: str | None = None, ) -> None: """Create a new Hugging Face provider. Args: base_url: The base url for the Hugging Face requests. api_key: The API key to use for authentication, if not provided, the `HF_TOKEN` environment variable will be used if available. hf_client: An existing [`AsyncInferenceClient`](https://huggingface.co/docs/huggingface_hub/v0.29.3/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient) client to use. If not provided, a new instance will be created. http_client: (currently ignored) An existing `httpx.AsyncClient` to use for making HTTP requests. provider_name : Name of the provider to use for inference. available providers can be found in the [HF Inference Providers documentation](https://huggingface.co/docs/inference-providers/index#partners). defaults to "auto", which will select the first available provider for the model, the first of the providers available for the model, sorted by the user's order in https://hf.co/settings/inference-providers. If `base_url` is passed, then `provider_name` is not used. """ api_key = api_key or os.getenv('HF_TOKEN') if api_key is None: raise UserError( 'Set the `HF_TOKEN` environment variable or pass it via `HuggingFaceProvider(api_key=...)`' 'to use the HuggingFace provider.' 
) if http_client is not None: raise ValueError('`http_client` is ignored for HuggingFace provider, please use `hf_client` instead.') if base_url is not None and provider_name is not None: raise ValueError('Cannot provide both `base_url` and `provider_name`.') if hf_client is None: self._client = AsyncInferenceClient(api_key=api_key, provider=provider_name, base_url=base_url) # type: ignore else: self._client = hf_client ``` ### __init__ ```python __init__( *, base_url: str, api_key: str | None = None ) -> None ``` ```python __init__( *, provider_name: str, api_key: str | None = None ) -> None ``` ```python __init__( *, hf_client: AsyncInferenceClient, api_key: str | None = None ) -> None ``` ```python __init__( *, hf_client: AsyncInferenceClient, base_url: str, api_key: str | None = None ) -> None ``` ```python __init__( *, hf_client: AsyncInferenceClient, provider_name: str, api_key: str | None = None ) -> None ``` ```python __init__(*, api_key: str | None = None) -> None ``` ```python __init__( base_url: str | None = None, api_key: str | None = None, hf_client: AsyncInferenceClient | None = None, http_client: AsyncClient | None = None, provider_name: str | None = None, ) -> None ``` Create a new Hugging Face provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `base_url` | `str | None` | The base url for the Hugging Face requests. | `None` | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the HF_TOKEN environment variable will be used if available. | `None` | | `hf_client` | `AsyncInferenceClient | None` | An existing AsyncInferenceClient client to use. If not provided, a new instance will be created. | `None` | | `http_client` | `AsyncClient | None` | (currently ignored) An existing httpx.AsyncClient to use for making HTTP requests. | `None` | | `provider_name` | | Name of the provider to use for inference. available providers can be found in the HF Inference Providers documentation. defaults to "auto", which will select the first available provider for the model, the first of the providers available for the model, sorted by the user's order in https://hf.co/settings/inference-providers. If base_url is passed, then provider_name is not used. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/huggingface.py` ```python def __init__( self, base_url: str | None = None, api_key: str | None = None, hf_client: AsyncInferenceClient | None = None, http_client: AsyncClient | None = None, provider_name: str | None = None, ) -> None: """Create a new Hugging Face provider. Args: base_url: The base url for the Hugging Face requests. api_key: The API key to use for authentication, if not provided, the `HF_TOKEN` environment variable will be used if available. hf_client: An existing [`AsyncInferenceClient`](https://huggingface.co/docs/huggingface_hub/v0.29.3/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient) client to use. If not provided, a new instance will be created. http_client: (currently ignored) An existing `httpx.AsyncClient` to use for making HTTP requests. provider_name : Name of the provider to use for inference. available providers can be found in the [HF Inference Providers documentation](https://huggingface.co/docs/inference-providers/index#partners). defaults to "auto", which will select the first available provider for the model, the first of the providers available for the model, sorted by the user's order in https://hf.co/settings/inference-providers. 
If `base_url` is passed, then `provider_name` is not used. """ api_key = api_key or os.getenv('HF_TOKEN') if api_key is None: raise UserError( 'Set the `HF_TOKEN` environment variable or pass it via `HuggingFaceProvider(api_key=...)`' 'to use the HuggingFace provider.' ) if http_client is not None: raise ValueError('`http_client` is ignored for HuggingFace provider, please use `hf_client` instead.') if base_url is not None and provider_name is not None: raise ValueError('Cannot provide both `base_url` and `provider_name`.') if hf_client is None: self._client = AsyncInferenceClient(api_key=api_key, provider=provider_name, base_url=base_url) # type: ignore else: self._client = hf_client ``` Bases: `Provider[AsyncOpenAI]` Provider for MoonshotAI platform (Kimi models). Source code in `pydantic_ai_slim/pydantic_ai/providers/moonshotai.py` ```python class MoonshotAIProvider(Provider[AsyncOpenAI]): """Provider for MoonshotAI platform (Kimi models).""" @property def name(self) -> str: return 'moonshotai' @property def base_url(self) -> str: # OpenAI-compatible endpoint, see MoonshotAI docs return 'https://api.moonshot.ai/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: profile = moonshotai_model_profile(model_name) # As the MoonshotAI API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer, # unless json_schema_transformer is set explicitly. # Also, MoonshotAI does not support strict tool definitions # https://platform.moonshot.ai/docs/guide/migrating-from-openai-to-kimi#about-tool_choice # "Please note that the current version of Kimi API does not support the tool_choice=required parameter." return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, openai_supports_tool_choice_required=False, supports_json_object_output=True, ).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('MOONSHOTAI_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `MOONSHOTAI_API_KEY` environment variable or pass it via ' '`MoonshotAIProvider(api_key=...)` to use the MoonshotAI provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='moonshotai') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for local or remote Ollama API. 
Source code in `pydantic_ai_slim/pydantic_ai/providers/ollama.py` ```python class OllamaProvider(Provider[AsyncOpenAI]): """Provider for local or remote Ollama API.""" @property def name(self) -> str: return 'ollama' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: prefix_to_profile = { 'llama': meta_model_profile, 'gemma': google_model_profile, 'qwen': qwen_model_profile, 'qwq': qwen_model_profile, 'deepseek': deepseek_model_profile, 'mistral': mistral_model_profile, 'command': cohere_model_profile, } profile = None for prefix, profile_func in prefix_to_profile.items(): model_name = model_name.lower() if model_name.startswith(prefix): profile = profile_func(model_name) # As OllamaProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Ollama provider. Args: base_url: The base url for the Ollama requests. If not provided, the `OLLAMA_BASE_URL` environment variable will be used if available. api_key: The API key to use for authentication, if not provided, the `OLLAMA_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if openai_client is not None: assert base_url is None, 'Cannot provide both `openai_client` and `base_url`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client else: base_url = base_url or os.getenv('OLLAMA_BASE_URL') if not base_url: raise UserError( 'Set the `OLLAMA_BASE_URL` environment variable or pass it via `OllamaProvider(base_url=...)`' 'to use the Ollama provider.' ) # This is a workaround for the OpenAI client requiring an API key, whilst locally served, # openai compatible models do not always need an API key, but a placeholder (non-empty) key is required. api_key = api_key or os.getenv('OLLAMA_API_KEY') or 'api-key-not-set' if http_client is not None: self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='ollama') self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) ``` ### __init__ ```python __init__( base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None, ) -> None ``` Create a new Ollama provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `base_url` | `str | None` | The base url for the Ollama requests. If not provided, the OLLAMA_BASE_URL environment variable will be used if available. 
| `None` | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the OLLAMA_API_KEY environment variable will be used if available. | `None` | | `openai_client` | `AsyncOpenAI | None` | An existing AsyncOpenAI client to use. If provided, base_url, api_key, and http_client must be None. | `None` | | `http_client` | `AsyncClient | None` | An existing httpx.AsyncClient to use for making HTTP requests. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/ollama.py` ```python def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Ollama provider. Args: base_url: The base url for the Ollama requests. If not provided, the `OLLAMA_BASE_URL` environment variable will be used if available. api_key: The API key to use for authentication, if not provided, the `OLLAMA_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if openai_client is not None: assert base_url is None, 'Cannot provide both `openai_client` and `base_url`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client else: base_url = base_url or os.getenv('OLLAMA_BASE_URL') if not base_url: raise UserError( 'Set the `OLLAMA_BASE_URL` environment variable or pass it via `OllamaProvider(base_url=...)`' 'to use the Ollama provider.' ) # This is a workaround for the OpenAI client requiring an API key, whilst locally served, # openai compatible models do not always need an API key, but a placeholder (non-empty) key is required. api_key = api_key or os.getenv('OLLAMA_API_KEY') or 'api-key-not-set' if http_client is not None: self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='ollama') self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for LiteLLM API. 
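A minimal sketch before the source listing; the proxy address and key are placeholders. When no `api_key` is given, a placeholder key is used internally, since LiteLLM can read credentials from the environment itself.

```python
from pydantic_ai.providers.litellm import LiteLLMProvider

# Point at a LiteLLM proxy or compatible endpoint; both values are placeholders.
provider = LiteLLMProvider(
    api_key='litellm-placeholder-key',
    api_base='http://localhost:4000',  # hypothetical proxy address
)

# Profiles are picked by the 'provider/' prefix and fall back to the OpenAI profile.
profile = provider.model_profile('anthropic/claude-3-5-sonnet-20240620')
```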
Source code in `pydantic_ai_slim/pydantic_ai/providers/litellm.py` ```python class LiteLLMProvider(Provider[AsyncOpenAI]): """Provider for LiteLLM API.""" @property def name(self) -> str: return 'litellm' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: # Map provider prefixes to their profile functions provider_to_profile = { 'anthropic': anthropic_model_profile, 'openai': openai_model_profile, 'google': google_model_profile, 'mistralai': mistral_model_profile, 'mistral': mistral_model_profile, 'cohere': cohere_model_profile, 'amazon': amazon_model_profile, 'bedrock': amazon_model_profile, 'meta-llama': meta_model_profile, 'meta': meta_model_profile, 'groq': groq_model_profile, 'deepseek': deepseek_model_profile, 'moonshotai': moonshotai_model_profile, 'x-ai': grok_model_profile, 'qwen': qwen_model_profile, } profile = None # Check if model name contains a provider prefix (e.g., "anthropic/claude-3") if '/' in model_name: provider_prefix, model_suffix = model_name.split('/', 1) if provider_prefix in provider_to_profile: profile = provider_to_profile[provider_prefix](model_suffix) # If no profile found, default to OpenAI profile if profile is None: profile = openai_model_profile(model_name) # As LiteLLMProvider is used with OpenAIModel, which uses OpenAIJsonSchemaTransformer, # we maintain that behavior return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__( self, *, api_key: str | None = None, api_base: str | None = None, ) -> None: ... @overload def __init__( self, *, api_key: str | None = None, api_base: str | None = None, http_client: AsyncHTTPClient, ) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI) -> None: ... def __init__( self, *, api_key: str | None = None, api_base: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncHTTPClient | None = None, ) -> None: """Initialize a LiteLLM provider. Args: api_key: API key for the model provider. If None, LiteLLM will try to get it from environment variables. api_base: Base URL for the model provider. Use this for custom endpoints or self-hosted models. openai_client: Pre-configured OpenAI client. If provided, other parameters are ignored. http_client: Custom HTTP client to use. """ if openai_client is not None: self._client = openai_client return # Create OpenAI client that will be used with LiteLLM's completion function # The actual API calls will be intercepted and routed through LiteLLM if http_client is not None: self._client = AsyncOpenAI( base_url=api_base, api_key=api_key or 'litellm-placeholder', http_client=http_client ) else: http_client = cached_async_http_client(provider='litellm') self._client = AsyncOpenAI( base_url=api_base, api_key=api_key or 'litellm-placeholder', http_client=http_client ) ``` ### __init__ ```python __init__( *, api_key: str | None = None, api_base: str | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, api_base: str | None = None, http_client: AsyncClient ) -> None ``` ```python __init__(*, openai_client: AsyncOpenAI) -> None ``` ```python __init__( *, api_key: str | None = None, api_base: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None ) -> None ``` Initialize a LiteLLM provider. 
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str | None` | API key for the model provider. If None, LiteLLM will try to get it from environment variables. | `None` | | `api_base` | `str | None` | Base URL for the model provider. Use this for custom endpoints or self-hosted models. | `None` | | `openai_client` | `AsyncOpenAI | None` | Pre-configured OpenAI client. If provided, other parameters are ignored. | `None` | | `http_client` | `AsyncClient | None` | Custom HTTP client to use. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/litellm.py` ```python def __init__( self, *, api_key: str | None = None, api_base: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncHTTPClient | None = None, ) -> None: """Initialize a LiteLLM provider. Args: api_key: API key for the model provider. If None, LiteLLM will try to get it from environment variables. api_base: Base URL for the model provider. Use this for custom endpoints or self-hosted models. openai_client: Pre-configured OpenAI client. If provided, other parameters are ignored. http_client: Custom HTTP client to use. """ if openai_client is not None: self._client = openai_client return # Create OpenAI client that will be used with LiteLLM's completion function # The actual API calls will be intercepted and routed through LiteLLM if http_client is not None: self._client = AsyncOpenAI( base_url=api_base, api_key=api_key or 'litellm-placeholder', http_client=http_client ) else: http_client = cached_async_http_client(provider='litellm') self._client = AsyncOpenAI( base_url=api_base, api_key=api_key or 'litellm-placeholder', http_client=http_client ) ``` # `pydantic_ai.result` ### StreamedRunResult Bases: `Generic[AgentDepsT, OutputDataT]` Result of a streamed run that returns structured data via a tool call. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python @dataclass(init=False) class StreamedRunResult(Generic[AgentDepsT, OutputDataT]): """Result of a streamed run that returns structured data via a tool call.""" _all_messages: list[_messages.ModelMessage] _new_message_index: int _stream_response: AgentStream[AgentDepsT, OutputDataT] | None = None _on_complete: Callable[[], Awaitable[None]] | None = None _run_result: AgentRunResult[OutputDataT] | None = None is_complete: bool = field(default=False, init=False) """Whether the stream has all been received. This is set to `True` when one of [`stream_output`][pydantic_ai.result.StreamedRunResult.stream_output], [`stream_text`][pydantic_ai.result.StreamedRunResult.stream_text], [`stream_responses`][pydantic_ai.result.StreamedRunResult.stream_responses] or [`get_output`][pydantic_ai.result.StreamedRunResult.get_output] completes. """ @overload def __init__( self, all_messages: list[_messages.ModelMessage], new_message_index: int, stream_response: AgentStream[AgentDepsT, OutputDataT] | None, on_complete: Callable[[], Awaitable[None]] | None, ) -> None: ... @overload def __init__( self, all_messages: list[_messages.ModelMessage], new_message_index: int, *, run_result: AgentRunResult[OutputDataT], ) -> None: ... 
def __init__( self, all_messages: list[_messages.ModelMessage], new_message_index: int, stream_response: AgentStream[AgentDepsT, OutputDataT] | None = None, on_complete: Callable[[], Awaitable[None]] | None = None, run_result: AgentRunResult[OutputDataT] | None = None, ) -> None: self._all_messages = all_messages self._new_message_index = new_message_index self._stream_response = stream_response self._on_complete = on_complete self._run_result = run_result def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. """ # this is a method to be consistent with the other methods if output_tool_return_content is not None: raise NotImplementedError('Setting output tool return content is not supported for this result type.') return self._all_messages def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return all messages from [`all_messages`][pydantic_ai.result.StreamedRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) def new_messages( self, *, output_tool_return_content: str | None = None ) -> list[_messages.ModelMessage]: # pragma: no cover """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return new messages from [`new_messages`][pydantic_ai.result.StreamedRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. 
""" return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) @deprecated('`StreamedRunResult.stream` is deprecated, use `stream_output` instead.') async def stream(self, *, debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]: async for output in self.stream_output(debounce_by=debounce_by): yield output async def stream_output(self, *, debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]: """Stream the output as an async iterable. The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. Args: debounce_by: by how much (if at all) to debounce/group the output chunks by. `None` means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. Returns: An async iterable of the response data. """ if self._run_result is not None: yield self._run_result.output await self._marked_completed() elif self._stream_response is not None: async for output in self._stream_response.stream_output(debounce_by=debounce_by): yield output await self._marked_completed(self._stream_response.get()) else: raise ValueError('No stream response or run result provided') # pragma: no cover async def stream_text(self, *, delta: bool = False, debounce_by: float | None = 0.1) -> AsyncIterator[str]: """Stream the text result as an async iterable. !!! note Result validators will NOT be called on the text result if `delta=True`. Args: delta: if `True`, yield each chunk of text as it is received, if `False` (default), yield the full text up to the current point. debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. """ if self._run_result is not None: # pragma: no cover # We can't really get here, as `_run_result` is only set in `run_stream` when `CallToolsNode` produces `DeferredToolRequests` output # as a result of a tool function raising `CallDeferred` or `ApprovalRequired`. # That'll change if we ever support something like `raise EndRun(output: OutputT)` where `OutputT` could be `str`. if not isinstance(self._run_result.output, str): raise exceptions.UserError('stream_text() can only be used with text responses') yield self._run_result.output await self._marked_completed() elif self._stream_response is not None: async for text in self._stream_response.stream_text(delta=delta, debounce_by=debounce_by): yield text await self._marked_completed(self._stream_response.get()) else: raise ValueError('No stream response or run result provided') # pragma: no cover @deprecated('`StreamedRunResult.stream_structured` is deprecated, use `stream_responses` instead.') async def stream_structured( self, *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]: async for msg, last in self.stream_responses(debounce_by=debounce_by): yield msg, last async def stream_responses( self, *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]: """Stream the response as an async iterable of Structured LLM Messages. Args: debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. 
Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. Returns: An async iterable of the structured response message and whether that is the last message. """ if self._run_result is not None: model_response = cast(_messages.ModelResponse, self.all_messages()[-1]) yield model_response, True await self._marked_completed() elif self._stream_response is not None: # if the message currently has any parts with content, yield before streaming async for msg in self._stream_response.stream_responses(debounce_by=debounce_by): yield msg, False msg = self._stream_response.get() yield msg, True await self._marked_completed(msg) else: raise ValueError('No stream response or run result provided') # pragma: no cover async def get_output(self) -> OutputDataT: """Stream the whole response, validate and return it.""" if self._run_result is not None: output = self._run_result.output await self._marked_completed() return output elif self._stream_response is not None: output = await self._stream_response.get_output() await self._marked_completed(self._stream_response.get()) return output else: raise ValueError('No stream response or run result provided') # pragma: no cover def usage(self) -> RunUsage: """Return the usage of the whole run. !!! note This won't return the full usage until the stream is finished. """ if self._run_result is not None: return self._run_result.usage() elif self._stream_response is not None: return self._stream_response.usage() else: raise ValueError('No stream response or run result provided') # pragma: no cover def timestamp(self) -> datetime: """Get the timestamp of the response.""" if self._run_result is not None: return self._run_result.timestamp() elif self._stream_response is not None: return self._stream_response.timestamp() else: raise ValueError('No stream response or run result provided') # pragma: no cover @deprecated('`validate_structured_output` is deprecated, use `validate_response_output` instead.') async def validate_structured_output( self, message: _messages.ModelResponse, *, allow_partial: bool = False ) -> OutputDataT: return await self.validate_response_output(message, allow_partial=allow_partial) async def validate_response_output( self, message: _messages.ModelResponse, *, allow_partial: bool = False ) -> OutputDataT: """Validate a structured result message.""" if self._run_result is not None: return self._run_result.output elif self._stream_response is not None: return await self._stream_response.validate_response_output(message, allow_partial=allow_partial) else: raise ValueError('No stream response or run result provided') # pragma: no cover async def _marked_completed(self, message: _messages.ModelResponse | None = None) -> None: self.is_complete = True if message is not None: self._all_messages.append(message) if self._on_complete is not None: await self._on_complete() ``` #### is_complete ```python is_complete: bool = field(default=False, init=False) ``` Whether the stream has all been received. This is set to `True` when one of stream_output, stream_text, stream_responses or get_output completes. #### all_messages ```python all_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return the history of \_messages. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. 
This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `list[ModelMessage]` | List of messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. """ # this is a method to be consistent with the other methods if output_tool_return_content is not None: raise NotImplementedError('Setting output tool return content is not supported for this result type.') return self._all_messages ``` #### all_messages_json ```python all_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return all messages from all_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `bytes` | JSON bytes representing the messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return all messages from [`all_messages`][pydantic_ai.result.StreamedRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) ``` #### new_messages ```python new_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return new messages associated with this run. Messages from older runs are excluded. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `list[ModelMessage]` | List of new messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def new_messages( self, *, output_tool_return_content: str | None = None ) -> list[_messages.ModelMessage]: # pragma: no cover """Return new messages associated with this run. 
Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] ``` #### new_messages_json ```python new_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return new messages from new_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `bytes` | JSON bytes representing the new messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return new messages from [`new_messages`][pydantic_ai.result.StreamedRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) ``` #### stream ```python stream( *, debounce_by: float | None = 0.1 ) -> AsyncIterator[OutputDataT] ``` Deprecated `StreamedRunResult.stream` is deprecated, use `stream_output` instead. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python @deprecated('`StreamedRunResult.stream` is deprecated, use `stream_output` instead.') async def stream(self, *, debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]: async for output in self.stream_output(debounce_by=debounce_by): yield output ``` #### stream_output ```python stream_output( *, debounce_by: float | None = 0.1 ) -> AsyncIterator[OutputDataT] ``` Stream the output as an async iterable. The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `debounce_by` | `float | None` | by how much (if at all) to debounce/group the output chunks by. None means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. | `0.1` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[OutputDataT]` | An async iterable of the response data. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def stream_output(self, *, debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]: """Stream the output as an async iterable. 
The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. Args: debounce_by: by how much (if at all) to debounce/group the output chunks by. `None` means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. Returns: An async iterable of the response data. """ if self._run_result is not None: yield self._run_result.output await self._marked_completed() elif self._stream_response is not None: async for output in self._stream_response.stream_output(debounce_by=debounce_by): yield output await self._marked_completed(self._stream_response.get()) else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### stream_text ```python stream_text( *, delta: bool = False, debounce_by: float | None = 0.1 ) -> AsyncIterator[str] ``` Stream the text result as an async iterable. Note Result validators will NOT be called on the text result if `delta=True`. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `delta` | `bool` | if True, yield each chunk of text as it is received, if False (default), yield the full text up to the current point. | `False` | | `debounce_by` | `float | None` | by how much (if at all) to debounce/group the response chunks by. None means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. | `0.1` | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def stream_text(self, *, delta: bool = False, debounce_by: float | None = 0.1) -> AsyncIterator[str]: """Stream the text result as an async iterable. !!! note Result validators will NOT be called on the text result if `delta=True`. Args: delta: if `True`, yield each chunk of text as it is received, if `False` (default), yield the full text up to the current point. debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. """ if self._run_result is not None: # pragma: no cover # We can't really get here, as `_run_result` is only set in `run_stream` when `CallToolsNode` produces `DeferredToolRequests` output # as a result of a tool function raising `CallDeferred` or `ApprovalRequired`. # That'll change if we ever support something like `raise EndRun(output: OutputT)` where `OutputT` could be `str`. if not isinstance(self._run_result.output, str): raise exceptions.UserError('stream_text() can only be used with text responses') yield self._run_result.output await self._marked_completed() elif self._stream_response is not None: async for text in self._stream_response.stream_text(delta=delta, debounce_by=debounce_by): yield text await self._marked_completed(self._stream_response.get()) else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### stream_structured ```python stream_structured( *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[ModelResponse, bool]] ``` Deprecated `StreamedRunResult.stream_structured` is deprecated, use `stream_responses` instead. 
Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python @deprecated('`StreamedRunResult.stream_structured` is deprecated, use `stream_responses` instead.') async def stream_structured( self, *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]: async for msg, last in self.stream_responses(debounce_by=debounce_by): yield msg, last ``` #### stream_responses ```python stream_responses( *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[ModelResponse, bool]] ``` Stream the response as an async iterable of Structured LLM Messages. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `debounce_by` | `float | None` | by how much (if at all) to debounce/group the response chunks by. None means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. | `0.1` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[tuple[ModelResponse, bool]]` | An async iterable of the structured response message and whether that is the last message. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def stream_responses( self, *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]: """Stream the response as an async iterable of Structured LLM Messages. Args: debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. Returns: An async iterable of the structured response message and whether that is the last message. """ if self._run_result is not None: model_response = cast(_messages.ModelResponse, self.all_messages()[-1]) yield model_response, True await self._marked_completed() elif self._stream_response is not None: # if the message currently has any parts with content, yield before streaming async for msg in self._stream_response.stream_responses(debounce_by=debounce_by): yield msg, False msg = self._stream_response.get() yield msg, True await self._marked_completed(msg) else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### get_output ```python get_output() -> OutputDataT ``` Stream the whole response, validate and return it. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def get_output(self) -> OutputDataT: """Stream the whole response, validate and return it.""" if self._run_result is not None: output = self._run_result.output await self._marked_completed() return output elif self._stream_response is not None: output = await self._stream_response.get_output() await self._marked_completed(self._stream_response.get()) return output else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### usage ```python usage() -> RunUsage ``` Return the usage of the whole run. Note This won't return the full usage until the stream is finished. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def usage(self) -> RunUsage: """Return the usage of the whole run. !!! note This won't return the full usage until the stream is finished. 
""" if self._run_result is not None: return self._run_result.usage() elif self._stream_response is not None: return self._stream_response.usage() else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### timestamp ```python timestamp() -> datetime ``` Get the timestamp of the response. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def timestamp(self) -> datetime: """Get the timestamp of the response.""" if self._run_result is not None: return self._run_result.timestamp() elif self._stream_response is not None: return self._stream_response.timestamp() else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### validate_structured_output ```python validate_structured_output( message: ModelResponse, *, allow_partial: bool = False ) -> OutputDataT ``` Deprecated `validate_structured_output` is deprecated, use `validate_response_output` instead. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python @deprecated('`validate_structured_output` is deprecated, use `validate_response_output` instead.') async def validate_structured_output( self, message: _messages.ModelResponse, *, allow_partial: bool = False ) -> OutputDataT: return await self.validate_response_output(message, allow_partial=allow_partial) ``` #### validate_response_output ```python validate_response_output( message: ModelResponse, *, allow_partial: bool = False ) -> OutputDataT ``` Validate a structured result message. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def validate_response_output( self, message: _messages.ModelResponse, *, allow_partial: bool = False ) -> OutputDataT: """Validate a structured result message.""" if self._run_result is not None: return self._run_result.output elif self._stream_response is not None: return await self._stream_response.validate_response_output(message, allow_partial=allow_partial) else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` # `pydantic_ai.retries` Retries utilities based on tenacity, especially for HTTP requests. This module provides HTTP transport wrappers and wait strategies that integrate with the tenacity library to add retry capabilities to HTTP requests. The transports can be used with HTTP clients that support custom transports (such as httpx), while the wait strategies can be used with any tenacity retry decorator. The module includes: - TenacityTransport: Synchronous HTTP transport with retry capabilities - AsyncTenacityTransport: Asynchronous HTTP transport with retry capabilities - wait_retry_after: Wait strategy that respects HTTP Retry-After headers ### RetryConfig Bases: `TypedDict` The configuration for tenacity-based retrying. These are precisely the arguments to the tenacity `retry` decorator, and they are generally used internally by passing them to that decorator via `@retry(**config)` or similar. All fields are optional, and if not provided, the default values from the `tenacity.retry` decorator will be used. Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ```python class RetryConfig(TypedDict, total=False): """The configuration for tenacity-based retrying. These are precisely the arguments to the tenacity `retry` decorator, and they are generally used internally by passing them to that decorator via `@retry(**config)` or similar. All fields are optional, and if not provided, the default values from the `tenacity.retry` decorator will be used. 
""" sleep: Callable[[int | float], None | Awaitable[None]] """A sleep strategy to use for sleeping between retries. Tenacity's default for this argument is `tenacity.nap.sleep`.""" stop: StopBaseT """ A stop strategy to determine when to stop retrying. Tenacity's default for this argument is `tenacity.stop.stop_never`.""" wait: WaitBaseT """ A wait strategy to determine how long to wait between retries. Tenacity's default for this argument is `tenacity.wait.wait_none`.""" retry: SyncRetryBaseT | RetryBaseT """A retry strategy to determine which exceptions should trigger a retry. Tenacity's default for this argument is `tenacity.retry.retry_if_exception_type()`.""" before: Callable[[RetryCallState], None | Awaitable[None]] """ A callable that is called before each retry attempt. Tenacity's default for this argument is `tenacity.before.before_nothing`.""" after: Callable[[RetryCallState], None | Awaitable[None]] """ A callable that is called after each retry attempt. Tenacity's default for this argument is `tenacity.after.after_nothing`.""" before_sleep: Callable[[RetryCallState], None | Awaitable[None]] | None """ An optional callable that is called before sleeping between retries. Tenacity's default for this argument is `None`.""" reraise: bool """Whether to reraise the last exception if the retry attempts are exhausted, or raise a RetryError instead. Tenacity's default for this argument is `False`.""" retry_error_cls: type[RetryError] """The exception class to raise when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `tenacity.RetryError`.""" retry_error_callback: Callable[[RetryCallState], Any | Awaitable[Any]] | None """An optional callable that is called when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `None`.""" ``` #### sleep ```python sleep: Callable[[int | float], None | Awaitable[None]] ``` A sleep strategy to use for sleeping between retries. Tenacity's default for this argument is `tenacity.nap.sleep`. #### stop ```python stop: StopBaseT ``` A stop strategy to determine when to stop retrying. Tenacity's default for this argument is `tenacity.stop.stop_never`. #### wait ```python wait: WaitBaseT ``` A wait strategy to determine how long to wait between retries. Tenacity's default for this argument is `tenacity.wait.wait_none`. #### retry ```python retry: RetryBaseT | RetryBaseT ``` A retry strategy to determine which exceptions should trigger a retry. Tenacity's default for this argument is `tenacity.retry.retry_if_exception_type()`. #### before ```python before: Callable[[RetryCallState], None | Awaitable[None]] ``` A callable that is called before each retry attempt. Tenacity's default for this argument is `tenacity.before.before_nothing`. #### after ```python after: Callable[[RetryCallState], None | Awaitable[None]] ``` A callable that is called after each retry attempt. Tenacity's default for this argument is `tenacity.after.after_nothing`. #### before_sleep ```python before_sleep: ( Callable[[RetryCallState], None | Awaitable[None]] | None ) ``` An optional callable that is called before sleeping between retries. Tenacity's default for this argument is `None`. #### reraise ```python reraise: bool ``` Whether to reraise the last exception if the retry attempts are exhausted, or raise a RetryError instead. Tenacity's default for this argument is `False`. 
#### retry_error_cls ```python retry_error_cls: type[RetryError] ``` The exception class to raise when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `tenacity.RetryError`. #### retry_error_callback ```python retry_error_callback: ( Callable[[RetryCallState], Any | Awaitable[Any]] | None ) ``` An optional callable that is called when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `None`. ### TenacityTransport Bases: `BaseTransport` Synchronous HTTP transport with tenacity-based retry functionality. This transport wraps another BaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate_response function can be used to convert HTTP responses into exceptions that trigger retries. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `wrapped` | `BaseTransport | None` | The underlying transport to wrap and add retry functionality to. | `None` | | `config` | `RetryConfig` | The arguments to use for the tenacity retry decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. | *required* | | `validate_response` | `Callable[[Response], Any] | None` | Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed. | `None` | Example ```python from httpx import Client, HTTPStatusError, HTTPTransport from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import RetryConfig, TenacityTransport, wait_retry_after transport = TenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=300), stop=stop_after_attempt(5), reraise=True ), HTTPTransport(), validate_response=lambda r: r.raise_for_status() ) client = Client(transport=transport) ``` Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ````python class TenacityTransport(BaseTransport): """Synchronous HTTP transport with tenacity-based retry functionality. This transport wraps another BaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate_response function can be used to convert HTTP responses into exceptions that trigger retries. Args: wrapped: The underlying transport to wrap and add retry functionality to. config: The arguments to use for the tenacity `retry` decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. validate_response: Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed. 
Example: ```python from httpx import Client, HTTPStatusError, HTTPTransport from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import RetryConfig, TenacityTransport, wait_retry_after transport = TenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=300), stop=stop_after_attempt(5), reraise=True ), HTTPTransport(), validate_response=lambda r: r.raise_for_status() ) client = Client(transport=transport) ``` """ def __init__( self, config: RetryConfig, wrapped: BaseTransport | None = None, validate_response: Callable[[Response], Any] | None = None, ): self.config = config self.wrapped = wrapped or HTTPTransport() self.validate_response = validate_response def handle_request(self, request: Request) -> Response: """Handle an HTTP request with retry logic. Args: request: The HTTP request to handle. Returns: The HTTP response. Raises: RuntimeError: If the retry controller did not make any attempts. Exception: Any exception raised by the wrapped transport or validation function. """ @retry(**self.config) def handle_request(req: Request) -> Response: response = self.wrapped.handle_request(req) # this is normally set by httpx _after_ calling this function, but we want the request in the validator: response.request = req if self.validate_response: try: self.validate_response(response) except Exception: response.close() raise return response return handle_request(request) def __enter__(self) -> TenacityTransport: self.wrapped.__enter__() return self def __exit__( self, exc_type: type[BaseException] | None = None, exc_value: BaseException | None = None, traceback: TracebackType | None = None, ) -> None: self.wrapped.__exit__(exc_type, exc_value, traceback) def close(self) -> None: self.wrapped.close() # pragma: no cover ```` #### handle_request ```python handle_request(request: Request) -> Response ``` Handle an HTTP request with retry logic. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `request` | `Request` | The HTTP request to handle. | *required* | Returns: | Type | Description | | --- | --- | | `Response` | The HTTP response. | Raises: | Type | Description | | --- | --- | | `RuntimeError` | If the retry controller did not make any attempts. | | `Exception` | Any exception raised by the wrapped transport or validation function. | Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ```python def handle_request(self, request: Request) -> Response: """Handle an HTTP request with retry logic. Args: request: The HTTP request to handle. Returns: The HTTP response. Raises: RuntimeError: If the retry controller did not make any attempts. Exception: Any exception raised by the wrapped transport or validation function. """ @retry(**self.config) def handle_request(req: Request) -> Response: response = self.wrapped.handle_request(req) # this is normally set by httpx _after_ calling this function, but we want the request in the validator: response.request = req if self.validate_response: try: self.validate_response(response) except Exception: response.close() raise return response return handle_request(request) ``` ### AsyncTenacityTransport Bases: `AsyncBaseTransport` Asynchronous HTTP transport with tenacity-based retry functionality. This transport wraps another AsyncBaseTransport and adds retry capabilities using the tenacity library. 
It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate_response function can be used to convert HTTP responses into exceptions that trigger retries. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `wrapped` | `AsyncBaseTransport | None` | The underlying async transport to wrap and add retry functionality to. | `None` | | `config` | `RetryConfig` | The arguments to use for the tenacity retry decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. | *required* | | `validate_response` | `Callable[[Response], Any] | None` | Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed. | `None` | Example ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after transport = AsyncTenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=300), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) client = AsyncClient(transport=transport) ``` Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ````python class AsyncTenacityTransport(AsyncBaseTransport): """Asynchronous HTTP transport with tenacity-based retry functionality. This transport wraps another AsyncBaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate_response function can be used to convert HTTP responses into exceptions that trigger retries. Args: wrapped: The underlying async transport to wrap and add retry functionality to. config: The arguments to use for the tenacity `retry` decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. validate_response: Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed. 
Example: ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after transport = AsyncTenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=300), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) client = AsyncClient(transport=transport) ``` """ def __init__( self, config: RetryConfig, wrapped: AsyncBaseTransport | None = None, validate_response: Callable[[Response], Any] | None = None, ): self.config = config self.wrapped = wrapped or AsyncHTTPTransport() self.validate_response = validate_response async def handle_async_request(self, request: Request) -> Response: """Handle an async HTTP request with retry logic. Args: request: The HTTP request to handle. Returns: The HTTP response. Raises: RuntimeError: If the retry controller did not make any attempts. Exception: Any exception raised by the wrapped transport or validation function. """ @retry(**self.config) async def handle_async_request(req: Request) -> Response: response = await self.wrapped.handle_async_request(req) # this is normally set by httpx _after_ calling this function, but we want the request in the validator: response.request = req if self.validate_response: try: self.validate_response(response) except Exception: await response.aclose() raise return response return await handle_async_request(request) async def __aenter__(self) -> AsyncTenacityTransport: await self.wrapped.__aenter__() return self async def __aexit__( self, exc_type: type[BaseException] | None = None, exc_value: BaseException | None = None, traceback: TracebackType | None = None, ) -> None: await self.wrapped.__aexit__(exc_type, exc_value, traceback) async def aclose(self) -> None: await self.wrapped.aclose() ```` #### handle_async_request ```python handle_async_request(request: Request) -> Response ``` Handle an async HTTP request with retry logic. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `request` | `Request` | The HTTP request to handle. | *required* | Returns: | Type | Description | | --- | --- | | `Response` | The HTTP response. | Raises: | Type | Description | | --- | --- | | `RuntimeError` | If the retry controller did not make any attempts. | | `Exception` | Any exception raised by the wrapped transport or validation function. | Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ```python async def handle_async_request(self, request: Request) -> Response: """Handle an async HTTP request with retry logic. Args: request: The HTTP request to handle. Returns: The HTTP response. Raises: RuntimeError: If the retry controller did not make any attempts. Exception: Any exception raised by the wrapped transport or validation function. 
""" @retry(**self.config) async def handle_async_request(req: Request) -> Response: response = await self.wrapped.handle_async_request(req) # this is normally set by httpx _after_ calling this function, but we want the request in the validator: response.request = req if self.validate_response: try: self.validate_response(response) except Exception: await response.aclose() raise return response return await handle_async_request(request) ``` ### wait_retry_after ```python wait_retry_after( fallback_strategy: ( Callable[[RetryCallState], float] | None ) = None, max_wait: float = 300, ) -> Callable[[RetryCallState], float] ``` Create a tenacity-compatible wait strategy that respects HTTP Retry-After headers. This wait strategy checks if the exception contains an HTTPStatusError with a Retry-After header, and if so, waits for the time specified in the header. If no header is present or parsing fails, it falls back to the provided strategy. The Retry-After header can be in two formats: - An integer representing seconds to wait - An HTTP date string representing when to retry Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `fallback_strategy` | `Callable[[RetryCallState], float] | None` | Wait strategy to use when no Retry-After header is present or parsing fails. Defaults to exponential backoff with max 60s. | `None` | | `max_wait` | `float` | Maximum time to wait in seconds, regardless of header value. Defaults to 300 (5 minutes). | `300` | Returns: | Type | Description | | --- | --- | | `Callable[[RetryCallState], float]` | A wait function that can be used with tenacity retry decorators. | Example ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after transport = AsyncTenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=120), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) client = AsyncClient(transport=transport) ``` Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ````python def wait_retry_after( fallback_strategy: Callable[[RetryCallState], float] | None = None, max_wait: float = 300 ) -> Callable[[RetryCallState], float]: """Create a tenacity-compatible wait strategy that respects HTTP Retry-After headers. This wait strategy checks if the exception contains an HTTPStatusError with a Retry-After header, and if so, waits for the time specified in the header. If no header is present or parsing fails, it falls back to the provided strategy. The Retry-After header can be in two formats: - An integer representing seconds to wait - An HTTP date string representing when to retry Args: fallback_strategy: Wait strategy to use when no Retry-After header is present or parsing fails. Defaults to exponential backoff with max 60s. max_wait: Maximum time to wait in seconds, regardless of header value. Defaults to 300 (5 minutes). Returns: A wait function that can be used with tenacity retry decorators. 
Example: ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after transport = AsyncTenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=120), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) client = AsyncClient(transport=transport) ``` """ if fallback_strategy is None: fallback_strategy = wait_exponential(multiplier=1, max=60) def wait_func(state: RetryCallState) -> float: exc = state.outcome.exception() if state.outcome else None if isinstance(exc, HTTPStatusError): retry_after = exc.response.headers.get('retry-after') if retry_after: try: # Try parsing as seconds first wait_seconds = int(retry_after) return min(float(wait_seconds), max_wait) except ValueError: # Try parsing as HTTP date try: retry_time = cast(datetime, parsedate_to_datetime(retry_after)) assert isinstance(retry_time, datetime) now = datetime.now(timezone.utc) wait_seconds = (retry_time - now).total_seconds() if wait_seconds > 0: return min(wait_seconds, max_wait) except (ValueError, TypeError, AssertionError): # If date parsing fails, fall back to fallback strategy pass # Use fallback strategy return fallback_strategy(state) return wait_func ```` # `pydantic_ai.settings` ### ModelSettings Bases: `TypedDict` Settings to configure an LLM. Here we include only settings which apply to multiple models / model providers, though not all of these settings are supported by all models. Source code in `pydantic_ai_slim/pydantic_ai/settings.py` ```python class ModelSettings(TypedDict, total=False): """Settings to configure an LLM. Here we include only settings which apply to multiple models / model providers, though not all of these settings are supported by all models. """ max_tokens: int """The maximum number of tokens to generate before stopping. Supported by: * Gemini * Anthropic * OpenAI * Groq * Cohere * Mistral * Bedrock * MCP Sampling """ temperature: float """Amount of randomness injected into the response. Use `temperature` closer to `0.0` for analytical / multiple choice, and closer to a model's maximum `temperature` for creative and generative tasks. Note that even with `temperature` of `0.0`, the results will not be fully deterministic. Supported by: * Gemini * Anthropic * OpenAI * Groq * Cohere * Mistral * Bedrock """ top_p: float """An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should either alter `temperature` or `top_p`, but not both. Supported by: * Gemini * Anthropic * OpenAI * Groq * Cohere * Mistral * Bedrock """ timeout: float | Timeout """Override the client-level default timeout for a request, in seconds. Supported by: * Gemini * Anthropic * OpenAI * Groq * Mistral """ parallel_tool_calls: bool """Whether to allow parallel tool calls. Supported by: * OpenAI (some models, not o1) * Groq * Anthropic """ seed: int """The random seed to use for the model, theoretically allowing for deterministic results. Supported by: * OpenAI * Groq * Cohere * Mistral * Gemini """ presence_penalty: float """Penalize new tokens based on whether they have appeared in the text so far. 
Supported by: * OpenAI * Groq * Cohere * Gemini * Mistral """ frequency_penalty: float """Penalize new tokens based on their existing frequency in the text so far. Supported by: * OpenAI * Groq * Cohere * Gemini * Mistral """ logit_bias: dict[str, int] """Modify the likelihood of specified tokens appearing in the completion. Supported by: * OpenAI * Groq """ stop_sequences: list[str] """Sequences that will cause the model to stop generating. Supported by: * OpenAI * Anthropic * Bedrock * Mistral * Groq * Cohere * Google """ extra_headers: dict[str, str] """Extra headers to send to the model. Supported by: * OpenAI * Anthropic * Groq """ extra_body: object """Extra body to send to the model. Supported by: * OpenAI * Anthropic * Groq """ ``` #### max_tokens ```python max_tokens: int ``` The maximum number of tokens to generate before stopping. Supported by: - Gemini - Anthropic - OpenAI - Groq - Cohere - Mistral - Bedrock - MCP Sampling #### temperature ```python temperature: float ``` Amount of randomness injected into the response. Use `temperature` closer to `0.0` for analytical / multiple choice, and closer to a model's maximum `temperature` for creative and generative tasks. Note that even with `temperature` of `0.0`, the results will not be fully deterministic. Supported by: - Gemini - Anthropic - OpenAI - Groq - Cohere - Mistral - Bedrock #### top_p ```python top_p: float ``` An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should either alter `temperature` or `top_p`, but not both. Supported by: - Gemini - Anthropic - OpenAI - Groq - Cohere - Mistral - Bedrock #### timeout ```python timeout: float | Timeout ``` Override the client-level default timeout for a request, in seconds. Supported by: - Gemini - Anthropic - OpenAI - Groq - Mistral #### parallel_tool_calls ```python parallel_tool_calls: bool ``` Whether to allow parallel tool calls. Supported by: - OpenAI (some models, not o1) - Groq - Anthropic #### seed ```python seed: int ``` The random seed to use for the model, theoretically allowing for deterministic results. Supported by: - OpenAI - Groq - Cohere - Mistral - Gemini #### presence_penalty ```python presence_penalty: float ``` Penalize new tokens based on whether they have appeared in the text so far. Supported by: - OpenAI - Groq - Cohere - Gemini - Mistral #### frequency_penalty ```python frequency_penalty: float ``` Penalize new tokens based on their existing frequency in the text so far. Supported by: - OpenAI - Groq - Cohere - Gemini - Mistral #### logit_bias ```python logit_bias: dict[str, int] ``` Modify the likelihood of specified tokens appearing in the completion. Supported by: - OpenAI - Groq #### stop_sequences ```python stop_sequences: list[str] ``` Sequences that will cause the model to stop generating. Supported by: - OpenAI - Anthropic - Bedrock - Mistral - Groq - Cohere - Google #### extra_headers ```python extra_headers: dict[str, str] ``` Extra headers to send to the model. Supported by: - OpenAI - Anthropic - Groq #### extra_body ```python extra_body: object ``` Extra body to send to the model. Supported by: - OpenAI - Anthropic - Groq # `pydantic_ai.tools` ### AgentDepsT ```python AgentDepsT = TypeVar( "AgentDepsT", default=None, contravariant=True ) ``` Type variable for agent dependencies. 
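For orientation, here is a minimal sketch (not part of the generated reference) of how `AgentDepsT` ties an agent's dependency type to the `RunContext` its tools receive; the `ApiDeps` dataclass and `fetch_status` tool below are hypothetical:

```python
from dataclasses import dataclass

from pydantic_ai import Agent, RunContext


@dataclass
class ApiDeps:
    """Hypothetical dependencies threaded through the run."""
    base_url: str


# AgentDepsT is bound to ApiDeps here; the agent has type Agent[ApiDeps, str].
agent = Agent('openai:gpt-4o', deps_type=ApiDeps)


@agent.tool
async def fetch_status(ctx: RunContext[ApiDeps]) -> str:
    """Tools see the same dependency type via RunContext[ApiDeps]."""
    return f'Would check {ctx.deps.base_url}/status here.'
```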
### RunContext Bases: `Generic[AgentDepsT]` Information about the current call. Source code in `pydantic_ai_slim/pydantic_ai/_run_context.py` ```python @dataclasses.dataclass(repr=False, kw_only=True) class RunContext(Generic[AgentDepsT]): """Information about the current call.""" deps: AgentDepsT """Dependencies for the agent.""" model: Model """The model used in this run.""" usage: RunUsage """LLM usage associated with the run.""" prompt: str | Sequence[_messages.UserContent] | None = None """The original user prompt passed to the run.""" messages: list[_messages.ModelMessage] = field(default_factory=list) """Messages exchanged in the conversation so far.""" tracer: Tracer = field(default_factory=NoOpTracer) """The tracer to use for tracing the run.""" trace_include_content: bool = False """Whether to include the content of the messages in the trace.""" retries: dict[str, int] = field(default_factory=dict) """Number of retries for each tool so far.""" tool_call_id: str | None = None """The ID of the tool call.""" tool_name: str | None = None """Name of the tool being called.""" retry: int = 0 """Number of retries so far.""" run_step: int = 0 """The current step in the run.""" tool_call_approved: bool = False """Whether a tool call that required approval has now been approved.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### deps ```python deps: AgentDepsT ``` Dependencies for the agent. #### model ```python model: Model ``` The model used in this run. #### usage ```python usage: RunUsage ``` LLM usage associated with the run. #### prompt ```python prompt: str | Sequence[UserContent] | None = None ``` The original user prompt passed to the run. #### messages ```python messages: list[ModelMessage] = field(default_factory=list) ``` Messages exchanged in the conversation so far. #### tracer ```python tracer: Tracer = field(default_factory=NoOpTracer) ``` The tracer to use for tracing the run. #### trace_include_content ```python trace_include_content: bool = False ``` Whether to include the content of the messages in the trace. #### retries ```python retries: dict[str, int] = field(default_factory=dict) ``` Number of retries for each tool so far. #### tool_call_id ```python tool_call_id: str | None = None ``` The ID of the tool call. #### tool_name ```python tool_name: str | None = None ``` Name of the tool being called. #### retry ```python retry: int = 0 ``` Number of retries so far. #### run_step ```python run_step: int = 0 ``` The current step in the run. #### tool_call_approved ```python tool_call_approved: bool = False ``` Whether a tool call that required approval has now been approved. ### ToolParams ```python ToolParams = ParamSpec('ToolParams', default=...) ``` Retrieval function param spec. ### SystemPromptFunc ```python SystemPromptFunc: TypeAlias = ( Callable[[RunContext[AgentDepsT]], str] | Callable[[RunContext[AgentDepsT]], Awaitable[str]] | Callable[[], str] | Callable[[], Awaitable[str]] ) ``` A function that may or may not take `RunContext` as an argument, and may or may not be async. Usage `SystemPromptFunc[AgentDepsT]`. ### ToolFuncContext ```python ToolFuncContext: TypeAlias = Callable[ Concatenate[RunContext[AgentDepsT], ToolParams], Any ] ``` A tool function that takes `RunContext` as the first argument. Usage `ToolFuncContext[AgentDepsT, ToolParams]`. ### ToolFuncPlain ```python ToolFuncPlain: TypeAlias = Callable[ToolParams, Any] ``` A tool function that does not take `RunContext` as the first argument. Usage `ToolFuncPlain[ToolParams]`.
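To make the distinction concrete, here is a small sketch (hypothetical functions, not from the reference) of the two shapes these aliases describe:

```python
from pydantic_ai import RunContext


def lookup_with_deps(ctx: RunContext[str], query: str) -> str:
    # Matches ToolFuncContext: RunContext is the first parameter.
    return f'{ctx.deps}: {query}'


def shout(query: str) -> str:
    # Matches ToolFuncPlain: no RunContext parameter at all.
    return query.upper()
```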
### ToolFuncEither ```python ToolFuncEither: TypeAlias = ( ToolFuncContext[AgentDepsT, ToolParams] | ToolFuncPlain[ToolParams] ) ``` Either kind of tool function. This is just a union of ToolFuncContext and ToolFuncPlain. Usage `ToolFuncEither[AgentDepsT, ToolParams]`. ### ToolPrepareFunc ```python ToolPrepareFunc: TypeAlias = Callable[ [RunContext[AgentDepsT], "ToolDefinition"], Awaitable["ToolDefinition | None"], ] ``` Definition of a function that can prepare a tool definition at call time. See [tool docs](../../tools-advanced/#tool-prepare) for more information. Example — here `only_if_42` is valid as a `ToolPrepareFunc`: ```python from pydantic_ai import RunContext, Tool from pydantic_ai.tools import ToolDefinition async def only_if_42( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: if ctx.deps == 42: return tool_def def hitchhiker(ctx: RunContext[int], answer: str) -> str: return f'{ctx.deps} {answer}' hitchhiker = Tool(hitchhiker, prepare=only_if_42) ``` Usage `ToolPrepareFunc[AgentDepsT]`. ### ToolsPrepareFunc ```python ToolsPrepareFunc: TypeAlias = Callable[ [RunContext[AgentDepsT], list["ToolDefinition"]], Awaitable["list[ToolDefinition] | None"], ] ``` Definition of a function that can prepare the tool definition of all tools for each step. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. Example — here `turn_on_strict_if_openai` is valid as a `ToolsPrepareFunc`: ```python from dataclasses import replace from pydantic_ai import Agent, RunContext from pydantic_ai.tools import ToolDefinition async def turn_on_strict_if_openai( ctx: RunContext[None], tool_defs: list[ToolDefinition] ) -> list[ToolDefinition] | None: if ctx.model.system == 'openai': return [replace(tool_def, strict=True) for tool_def in tool_defs] return tool_defs agent = Agent('openai:gpt-4o', prepare_tools=turn_on_strict_if_openai) ``` Usage `ToolsPrepareFunc[AgentDepsT]`. ### DocstringFormat ```python DocstringFormat: TypeAlias = Literal[ "google", "numpy", "sphinx", "auto" ] ``` Supported docstring formats. - `'google'` — [Google-style](https://google.github.io/styleguide/pyguide.html#381-docstrings) docstrings. - `'numpy'` — [Numpy-style](https://numpydoc.readthedocs.io/en/latest/format.html) docstrings. - `'sphinx'` — [Sphinx-style](https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html#the-sphinx-docstring-format) docstrings. - `'auto'` — Automatically infer the format based on the structure of the docstring. ### DeferredToolRequests Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools. Results can be passed to the next agent run using a DeferredToolResults object with the same tool call IDs. See [deferred tools docs](../../deferred-tools/#deferred-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(kw_only=True) class DeferredToolRequests: """Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools. Results can be passed to the next agent run using a [`DeferredToolResults`][pydantic_ai.tools.DeferredToolResults] object with the same tool call IDs. See [deferred tools docs](../deferred-tools.md#deferred-tools) for more information. 
""" calls: list[ToolCallPart] = field(default_factory=list) """Tool calls that require external execution.""" approvals: list[ToolCallPart] = field(default_factory=list) """Tool calls that require human-in-the-loop approval.""" ``` #### calls ```python calls: list[ToolCallPart] = field(default_factory=list) ``` Tool calls that require external execution. #### approvals ```python approvals: list[ToolCallPart] = field(default_factory=list) ``` Tool calls that require human-in-the-loop approval. ### ToolApproved Indicates that a tool call has been approved and that the tool function should be executed. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(kw_only=True) class ToolApproved: """Indicates that a tool call has been approved and that the tool function should be executed.""" override_args: dict[str, Any] | None = None """Optional tool call arguments to use instead of the original arguments.""" kind: Literal['tool-approved'] = 'tool-approved' ``` #### override_args ```python override_args: dict[str, Any] | None = None ``` Optional tool call arguments to use instead of the original arguments. ### ToolDenied Indicates that a tool call has been denied and that a denial message should be returned to the model. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass class ToolDenied: """Indicates that a tool call has been denied and that a denial message should be returned to the model.""" message: str = 'The tool call was denied.' """The message to return to the model.""" _: KW_ONLY kind: Literal['tool-denied'] = 'tool-denied' ``` #### message ```python message: str = 'The tool call was denied.' ``` The message to return to the model. ### DeferredToolResults Results for deferred tool calls from a previous run that required approval or external execution. The tool call IDs need to match those from the DeferredToolRequests output object from the previous run. See [deferred tools docs](../../deferred-tools/#deferred-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(kw_only=True) class DeferredToolResults: """Results for deferred tool calls from a previous run that required approval or external execution. The tool call IDs need to match those from the [`DeferredToolRequests`][pydantic_ai.output.DeferredToolRequests] output object from the previous run. See [deferred tools docs](../deferred-tools.md#deferred-tools) for more information. """ calls: dict[str, DeferredToolCallResult | Any] = field(default_factory=dict) """Map of tool call IDs to results for tool calls that required external execution.""" approvals: dict[str, bool | DeferredToolApprovalResult] = field(default_factory=dict) """Map of tool call IDs to results for tool calls that required human-in-the-loop approval.""" ``` #### calls ```python calls: dict[str, DeferredToolCallResult | Any] = field( default_factory=dict ) ``` Map of tool call IDs to results for tool calls that required external execution. #### approvals ```python approvals: dict[str, bool | DeferredToolApprovalResult] = ( field(default_factory=dict) ) ``` Map of tool call IDs to results for tool calls that required human-in-the-loop approval. ### Tool Bases: `Generic[AgentDepsT]` A tool function for an agent. 
Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ````python @dataclass(init=False) class Tool(Generic[AgentDepsT]): """A tool function for an agent.""" function: ToolFuncEither[AgentDepsT] takes_ctx: bool max_retries: int | None name: str description: str | None prepare: ToolPrepareFunc[AgentDepsT] | None docstring_format: DocstringFormat require_parameter_descriptions: bool strict: bool | None sequential: bool requires_approval: bool function_schema: _function_schema.FunctionSchema """ The base JSON schema for the tool's parameters. This schema may be modified by the `prepare` function or by the Model class prior to including it in an API request. """ def __init__( self, function: ToolFuncEither[AgentDepsT], *, takes_ctx: bool | None = None, max_retries: int | None = None, name: str | None = None, description: str | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, function_schema: _function_schema.FunctionSchema | None = None, ): """Create a new tool instance. Example usage: ```python {noqa="I001"} from pydantic_ai import Agent, RunContext, Tool async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' agent = Agent('test', tools=[Tool(my_tool)]) ``` or with a custom prepare method: ```python {noqa="I001"} from pydantic_ai import Agent, RunContext, Tool from pydantic_ai.tools import ToolDefinition async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' async def prep_my_tool( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: # only register the tool if `deps == 42` if ctx.deps == 42: return tool_def agent = Agent('test', tools=[Tool(my_tool, prepare=prep_my_tool)]) ``` Args: function: The Python function to call as the tool. takes_ctx: Whether the function takes a [`RunContext`][pydantic_ai.tools.RunContext] first argument, this is inferred if unset. max_retries: Maximum number of retries allowed for this tool, set to the agent default if `None`. name: Name of the tool, inferred from the function if `None`. description: Description of the tool, inferred from the function if `None`. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. 
function_schema: The function schema to use for the tool. If not provided, it will be generated. """ self.function = function self.function_schema = function_schema or _function_schema.function_schema( function, schema_generator, takes_ctx=takes_ctx, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, ) self.takes_ctx = self.function_schema.takes_ctx self.max_retries = max_retries self.name = name or function.__name__ self.description = description or self.function_schema.description self.prepare = prepare self.docstring_format = docstring_format self.require_parameter_descriptions = require_parameter_descriptions self.strict = strict self.sequential = sequential self.requires_approval = requires_approval @classmethod def from_schema( cls, function: Callable[..., Any], name: str, description: str | None, json_schema: JsonSchemaValue, takes_ctx: bool = False, sequential: bool = False, ) -> Self: """Creates a Pydantic tool from a function and a JSON schema. Args: function: The function to call. This will be called with keywords only, and no validation of the arguments will be performed. name: The unique name of the tool that clearly communicates its purpose description: Used to tell the model how/when/why to use the tool. You can provide few-shot examples as a part of the description. json_schema: The schema for the function arguments takes_ctx: An optional boolean parameter indicating whether the function accepts the context object as an argument. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. Returns: A Pydantic tool that calls the function """ function_schema = _function_schema.FunctionSchema( function=function, description=description, validator=SchemaValidator(schema=core_schema.any_schema()), json_schema=json_schema, takes_ctx=takes_ctx, is_async=_utils.is_async_callable(function), ) return cls( function, takes_ctx=takes_ctx, name=name, description=description, function_schema=function_schema, sequential=sequential, ) @property def tool_def(self): return ToolDefinition( name=self.name, description=self.description, parameters_json_schema=self.function_schema.json_schema, strict=self.strict, sequential=self.sequential, ) async def prepare_tool_def(self, ctx: RunContext[AgentDepsT]) -> ToolDefinition | None: """Get the tool definition. By default, this method creates a tool definition, then either returns it, or calls `self.prepare` if it's set. Returns: return a `ToolDefinition` or `None` if the tools should not be registered for this run. """ base_tool_def = self.tool_def if self.requires_approval and not ctx.tool_call_approved: base_tool_def = replace(base_tool_def, kind='unapproved') if self.prepare is not None: return await self.prepare(ctx, base_tool_def) else: return base_tool_def ```` #### __init__ ```python __init__( function: ToolFuncEither[AgentDepsT], *, takes_ctx: bool | None = None, max_retries: int | None = None, name: str | None = None, description: str | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, function_schema: FunctionSchema | None = None ) ``` Create a new tool instance. 
Example usage: ```python from pydantic_ai import Agent, RunContext, Tool async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' agent = Agent('test', tools=[Tool(my_tool)]) ``` or with a custom prepare method: ```python from pydantic_ai import Agent, RunContext, Tool from pydantic_ai.tools import ToolDefinition async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' async def prep_my_tool( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: # only register the tool if `deps == 42` if ctx.deps == 42: return tool_def agent = Agent('test', tools=[Tool(my_tool, prepare=prep_my_tool)]) ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `function` | `ToolFuncEither[AgentDepsT]` | The Python function to call as the tool. | *required* | | `takes_ctx` | `bool | None` | Whether the function takes a RunContext first argument, this is inferred if unset. | `None` | | `max_retries` | `int | None` | Maximum number of retries allowed for this tool, set to the agent default if None. | `None` | | `name` | `str | None` | Name of the tool, inferred from the function if None. | `None` | | `description` | `str | None` | Description of the tool, inferred from the function if None. | `None` | | `prepare` | `ToolPrepareFunc[AgentDepsT] | None` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | `None` | | `docstring_format` | `DocstringFormat` | The format of the docstring, see DocstringFormat. Defaults to 'auto', such that the format is inferred from the structure of the docstring. | `'auto'` | | `require_parameter_descriptions` | `bool` | If True, raise an error if a parameter description is missing. Defaults to False. | `False` | | `schema_generator` | `type[GenerateJsonSchema]` | The JSON schema generator class to use. Defaults to GenerateToolJsonSchema. | `GenerateToolJsonSchema` | | `strict` | `bool | None` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. | `None` | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. | `False` | | `requires_approval` | `bool` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. | `False` | | `function_schema` | `FunctionSchema | None` | The function schema to use for the tool. If not provided, it will be generated. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ````python def __init__( self, function: ToolFuncEither[AgentDepsT], *, takes_ctx: bool | None = None, max_retries: int | None = None, name: str | None = None, description: str | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, function_schema: _function_schema.FunctionSchema | None = None, ): """Create a new tool instance. 
Example usage: ```python {noqa="I001"} from pydantic_ai import Agent, RunContext, Tool async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' agent = Agent('test', tools=[Tool(my_tool)]) ``` or with a custom prepare method: ```python {noqa="I001"} from pydantic_ai import Agent, RunContext, Tool from pydantic_ai.tools import ToolDefinition async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' async def prep_my_tool( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: # only register the tool if `deps == 42` if ctx.deps == 42: return tool_def agent = Agent('test', tools=[Tool(my_tool, prepare=prep_my_tool)]) ``` Args: function: The Python function to call as the tool. takes_ctx: Whether the function takes a [`RunContext`][pydantic_ai.tools.RunContext] first argument, this is inferred if unset. max_retries: Maximum number of retries allowed for this tool, set to the agent default if `None`. name: Name of the tool, inferred from the function if `None`. description: Description of the tool, inferred from the function if `None`. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. function_schema: The function schema to use for the tool. If not provided, it will be generated. """ self.function = function self.function_schema = function_schema or _function_schema.function_schema( function, schema_generator, takes_ctx=takes_ctx, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, ) self.takes_ctx = self.function_schema.takes_ctx self.max_retries = max_retries self.name = name or function.__name__ self.description = description or self.function_schema.description self.prepare = prepare self.docstring_format = docstring_format self.require_parameter_descriptions = require_parameter_descriptions self.strict = strict self.sequential = sequential self.requires_approval = requires_approval ```` #### function_schema ```python function_schema: FunctionSchema = ( function_schema or function_schema( function, schema_generator, takes_ctx=takes_ctx, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, ) ) ``` The base JSON schema for the tool's parameters. This schema may be modified by the `prepare` function or by the Model class prior to including it in an API request. 
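As a quick illustration (a sketch under the API shown above, not part of the generated reference), the generated schema can be inspected on this attribute after constructing a tool from an annotated function:

```python
from pydantic_ai import Tool


def add(a: int, b: int) -> int:
    """Add two integers.

    Args:
        a: The first number.
        b: The second number.
    """
    return a + b


tool = Tool(add)
# function_schema.json_schema holds the base parameters schema; tool_def
# exposes it to prepare functions and, ultimately, to the model.
print(tool.function_schema.json_schema)
```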
#### from_schema ```python from_schema( function: Callable[..., Any], name: str, description: str | None, json_schema: JsonSchemaValue, takes_ctx: bool = False, sequential: bool = False, ) -> Self ``` Creates a Pydantic tool from a function and a JSON schema. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `function` | `Callable[..., Any]` | The function to call. This will be called with keywords only, and no validation of the arguments will be performed. | *required* | | `name` | `str` | The unique name of the tool that clearly communicates its purpose | *required* | | `description` | `str | None` | Used to tell the model how/when/why to use the tool. You can provide few-shot examples as a part of the description. | *required* | | `json_schema` | `JsonSchemaValue` | The schema for the function arguments | *required* | | `takes_ctx` | `bool` | An optional boolean parameter indicating whether the function accepts the context object as an argument. | `False` | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. | `False` | Returns: | Type | Description | | --- | --- | | `Self` | A Pydantic tool that calls the function | Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @classmethod def from_schema( cls, function: Callable[..., Any], name: str, description: str | None, json_schema: JsonSchemaValue, takes_ctx: bool = False, sequential: bool = False, ) -> Self: """Creates a Pydantic tool from a function and a JSON schema. Args: function: The function to call. This will be called with keywords only, and no validation of the arguments will be performed. name: The unique name of the tool that clearly communicates its purpose description: Used to tell the model how/when/why to use the tool. You can provide few-shot examples as a part of the description. json_schema: The schema for the function arguments takes_ctx: An optional boolean parameter indicating whether the function accepts the context object as an argument. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. Returns: A Pydantic tool that calls the function """ function_schema = _function_schema.FunctionSchema( function=function, description=description, validator=SchemaValidator(schema=core_schema.any_schema()), json_schema=json_schema, takes_ctx=takes_ctx, is_async=_utils.is_async_callable(function), ) return cls( function, takes_ctx=takes_ctx, name=name, description=description, function_schema=function_schema, sequential=sequential, ) ``` #### prepare_tool_def ```python prepare_tool_def( ctx: RunContext[AgentDepsT], ) -> ToolDefinition | None ``` Get the tool definition. By default, this method creates a tool definition, then either returns it, or calls `self.prepare` if it's set. Returns: | Type | Description | | --- | --- | | `ToolDefinition | None` | return a ToolDefinition or None if the tools should not be registered for this run. | Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python async def prepare_tool_def(self, ctx: RunContext[AgentDepsT]) -> ToolDefinition | None: """Get the tool definition. By default, this method creates a tool definition, then either returns it, or calls `self.prepare` if it's set. Returns: return a `ToolDefinition` or `None` if the tools should not be registered for this run. 
""" base_tool_def = self.tool_def if self.requires_approval and not ctx.tool_call_approved: base_tool_def = replace(base_tool_def, kind='unapproved') if self.prepare is not None: return await self.prepare(ctx, base_tool_def) else: return base_tool_def ``` ### ObjectJsonSchema ```python ObjectJsonSchema: TypeAlias = dict[str, Any] ``` Type representing JSON schema of an object, e.g. where `"type": "object"`. This type is used to define tools parameters (aka arguments) in ToolDefinition. With PEP-728 this should be a TypedDict with `type: Literal['object']`, and `extra_parts=Any` ### ToolDefinition Definition of a tool passed to a model. This is used for both function tools and output tools. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(repr=False, kw_only=True) class ToolDefinition: """Definition of a tool passed to a model. This is used for both function tools and output tools. """ name: str """The name of the tool.""" parameters_json_schema: ObjectJsonSchema = field(default_factory=lambda: {'type': 'object', 'properties': {}}) """The JSON schema for the tool's parameters.""" description: str | None = None """The description of the tool.""" outer_typed_dict_key: str | None = None """The key in the outer [TypedDict] that wraps an output tool. This will only be set for output tools which don't have an `object` JSON schema. """ strict: bool | None = None """Whether to enforce (vendor-specific) strict JSON schema validation for tool calls. Setting this to `True` while using a supported model generally imposes some restrictions on the tool's JSON schema in exchange for guaranteeing the API responses strictly match that schema. When `False`, the model may be free to generate other properties or types (depending on the vendor). When `None` (the default), the value will be inferred based on the compatibility of the parameters_json_schema. Note: this is currently only supported by OpenAI models. """ sequential: bool = False """Whether this tool requires a sequential/serial execution environment.""" kind: ToolKind = field(default='function') """The kind of tool: - `'function'`: a tool that will be executed by Pydantic AI during an agent run and has its result returned to the model - `'output'`: a tool that passes through an output value that ends the run - `'external'`: a tool whose result will be produced outside of the Pydantic AI agent run in which it was called, because it depends on an upstream service (or user) or could take longer to generate than it's reasonable to keep the agent process running. See the [tools documentation](../deferred-tools.md#deferred-tools) for more info. - `'unapproved'`: a tool that requires human-in-the-loop approval. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. """ @property def defer(self) -> bool: """Whether calls to this tool will be deferred. See the [tools documentation](../deferred-tools.md#deferred-tools) for more info. """ return self.kind in ('external', 'unapproved') __repr__ = _utils.dataclasses_no_defaults_repr ``` #### name ```python name: str ``` The name of the tool. #### parameters_json_schema ```python parameters_json_schema: ObjectJsonSchema = field( default_factory=lambda: { "type": "object", "properties": {}, } ) ``` The JSON schema for the tool's parameters. #### description ```python description: str | None = None ``` The description of the tool. 
#### outer_typed_dict_key ```python outer_typed_dict_key: str | None = None ``` The key in the outer [TypedDict] that wraps an output tool. This will only be set for output tools which don't have an `object` JSON schema. #### strict ```python strict: bool | None = None ``` Whether to enforce (vendor-specific) strict JSON schema validation for tool calls. Setting this to `True` while using a supported model generally imposes some restrictions on the tool's JSON schema in exchange for guaranteeing the API responses strictly match that schema. When `False`, the model may be free to generate other properties or types (depending on the vendor). When `None` (the default), the value will be inferred based on the compatibility of the parameters_json_schema. Note: this is currently only supported by OpenAI models. #### sequential ```python sequential: bool = False ``` Whether this tool requires a sequential/serial execution environment. #### kind ```python kind: ToolKind = field(default='function') ``` The kind of tool: - `'function'`: a tool that will be executed by Pydantic AI during an agent run and has its result returned to the model - `'output'`: a tool that passes through an output value that ends the run - `'external'`: a tool whose result will be produced outside of the Pydantic AI agent run in which it was called, because it depends on an upstream service (or user) or could take longer to generate than it's reasonable to keep the agent process running. See the [tools documentation](../../deferred-tools/#deferred-tools) for more info. - `'unapproved'`: a tool that requires human-in-the-loop approval. See the [tools documentation](../../deferred-tools/#human-in-the-loop-tool-approval) for more info. #### defer ```python defer: bool ``` Whether calls to this tool will be deferred. See the [tools documentation](../../deferred-tools/#deferred-tools) for more info. # `pydantic_ai.toolsets` ### AbstractToolset Bases: `ABC`, `Generic[AgentDepsT]` A toolset is a collection of tools that can be used by an agent. It is responsible for: - Listing the tools it contains - Validating the arguments of the tools - Calling the tools See [toolset docs](../../toolsets/) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python class AbstractToolset(ABC, Generic[AgentDepsT]): """A toolset is a collection of tools that can be used by an agent. It is responsible for: - Listing the tools it contains - Validating the arguments of the tools - Calling the tools See [toolset docs](../toolsets.md) for more information. """ @property @abstractmethod def id(self) -> str | None: """An ID for the toolset that is unique among all toolsets registered with the same agent. If you're implementing a concrete implementation that users can instantiate more than once, you should let them optionally pass a custom ID to the constructor and return that here. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. 
""" raise NotImplementedError() @property def label(self) -> str: """The name of the toolset for use in error messages.""" label = self.__class__.__name__ if self.id: # pragma: no branch label += f' {self.id!r}' return label @property def tool_name_conflict_hint(self) -> str: """A hint for how to avoid name conflicts with other toolsets for use in error messages.""" return 'Rename the tool or wrap the toolset in a `PrefixedToolset` to avoid name conflicts.' async def __aenter__(self) -> Self: """Enter the toolset context. This is where you can set up network connections in a concrete implementation. """ return self async def __aexit__(self, *args: Any) -> bool | None: """Exit the toolset context. This is where you can tear down network connections in a concrete implementation. """ return None @abstractmethod async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: """The tools that are available in this toolset.""" raise NotImplementedError() @abstractmethod async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: """Call a tool with the given arguments. Args: name: The name of the tool to call. tool_args: The arguments to pass to the tool. ctx: The run context. tool: The tool definition returned by [`get_tools`][pydantic_ai.toolsets.AbstractToolset.get_tools] that was called. """ raise NotImplementedError() def apply(self, visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None: """Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling).""" visitor(self) def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: """Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling) and replace them in the hierarchy with the result of the function.""" return visitor(self) def filtered( self, filter_func: Callable[[RunContext[AgentDepsT], ToolDefinition], bool] ) -> FilteredToolset[AgentDepsT]: """Returns a new toolset that filters this toolset's tools using a filter function that takes the agent context and the tool definition. See [toolset docs](../toolsets.md#filtering-tools) for more information. """ from .filtered import FilteredToolset return FilteredToolset(self, filter_func) def prefixed(self, prefix: str) -> PrefixedToolset[AgentDepsT]: """Returns a new toolset that prefixes the names of this toolset's tools. See [toolset docs](../toolsets.md#prefixing-tool-names) for more information. """ from .prefixed import PrefixedToolset return PrefixedToolset(self, prefix) def prepared(self, prepare_func: ToolsPrepareFunc[AgentDepsT]) -> PreparedToolset[AgentDepsT]: """Returns a new toolset that prepares this toolset's tools using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](../toolsets.md#preparing-tool-definitions) for more information. """ from .prepared import PreparedToolset return PreparedToolset(self, prepare_func) def renamed(self, name_map: dict[str, str]) -> RenamedToolset[AgentDepsT]: """Returns a new toolset that renames this toolset's tools using a dictionary mapping new names to original names. See [toolset docs](../toolsets.md#renaming-tools) for more information. 
""" from .renamed import RenamedToolset return RenamedToolset(self, name_map) def approval_required( self, approval_required_func: Callable[[RunContext[AgentDepsT], ToolDefinition, dict[str, Any]], bool] = ( lambda ctx, tool_def, tool_args: True ), ) -> ApprovalRequiredToolset[AgentDepsT]: """Returns a new toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](../toolsets.md#requiring-tool-approval) for more information. """ from .approval_required import ApprovalRequiredToolset return ApprovalRequiredToolset(self, approval_required_func) ``` #### id ```python id: str | None ``` An ID for the toolset that is unique among all toolsets registered with the same agent. If you're implementing a concrete implementation that users can instantiate more than once, you should let them optionally pass a custom ID to the constructor and return that here. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. #### label ```python label: str ``` The name of the toolset for use in error messages. #### tool_name_conflict_hint ```python tool_name_conflict_hint: str ``` A hint for how to avoid name conflicts with other toolsets for use in error messages. #### __aenter__ ```python __aenter__() -> Self ``` Enter the toolset context. This is where you can set up network connections in a concrete implementation. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python async def __aenter__(self) -> Self: """Enter the toolset context. This is where you can set up network connections in a concrete implementation. """ return self ``` #### __aexit__ ```python __aexit__(*args: Any) -> bool | None ``` Exit the toolset context. This is where you can tear down network connections in a concrete implementation. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python async def __aexit__(self, *args: Any) -> bool | None: """Exit the toolset context. This is where you can tear down network connections in a concrete implementation. """ return None ``` #### get_tools ```python get_tools( ctx: RunContext[AgentDepsT], ) -> dict[str, ToolsetTool[AgentDepsT]] ``` The tools that are available in this toolset. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python @abstractmethod async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: """The tools that are available in this toolset.""" raise NotImplementedError() ``` #### call_tool ```python call_tool( name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> Any ``` Call a tool with the given arguments. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str` | The name of the tool to call. | *required* | | `tool_args` | `dict[str, Any]` | The arguments to pass to the tool. | *required* | | `ctx` | `RunContext[AgentDepsT]` | The run context. | *required* | | `tool` | `ToolsetTool[AgentDepsT]` | The tool definition returned by get_tools that was called. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python @abstractmethod async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: """Call a tool with the given arguments. Args: name: The name of the tool to call. tool_args: The arguments to pass to the tool. 
ctx: The run context. tool: The tool definition returned by [`get_tools`][pydantic_ai.toolsets.AbstractToolset.get_tools] that was called. """ raise NotImplementedError() ``` #### apply ```python apply( visitor: Callable[[AbstractToolset[AgentDepsT]], None], ) -> None ``` Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling). Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def apply(self, visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None: """Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling).""" visitor(self) ``` #### visit_and_replace ```python visit_and_replace( visitor: Callable[ [AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT], ], ) -> AbstractToolset[AgentDepsT] ``` Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling) and replace them in the hierarchy with the result of the function. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: """Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling) and replace them in the hierarchy with the result of the function.""" return visitor(self) ``` #### filtered ```python filtered( filter_func: Callable[ [RunContext[AgentDepsT], ToolDefinition], bool ], ) -> FilteredToolset[AgentDepsT] ``` Returns a new toolset that filters this toolset's tools using a filter function that takes the agent context and the tool definition. See [toolset docs](../../toolsets/#filtering-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def filtered( self, filter_func: Callable[[RunContext[AgentDepsT], ToolDefinition], bool] ) -> FilteredToolset[AgentDepsT]: """Returns a new toolset that filters this toolset's tools using a filter function that takes the agent context and the tool definition. See [toolset docs](../toolsets.md#filtering-tools) for more information. """ from .filtered import FilteredToolset return FilteredToolset(self, filter_func) ``` #### prefixed ```python prefixed(prefix: str) -> PrefixedToolset[AgentDepsT] ``` Returns a new toolset that prefixes the names of this toolset's tools. See [toolset docs](../../toolsets/#prefixing-tool-names) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def prefixed(self, prefix: str) -> PrefixedToolset[AgentDepsT]: """Returns a new toolset that prefixes the names of this toolset's tools. See [toolset docs](../toolsets.md#prefixing-tool-names) for more information. """ from .prefixed import PrefixedToolset return PrefixedToolset(self, prefix) ``` #### prepared ```python prepared( prepare_func: ToolsPrepareFunc[AgentDepsT], ) -> PreparedToolset[AgentDepsT] ``` Returns a new toolset that prepares this toolset's tools using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](../../toolsets/#preparing-tool-definitions) for more information. 
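For example, a prepare function can adjust tool definitions at run time, as long as it doesn't add or rename tools. A rough sketch of how these wrapping methods compose (the tool, the `add_disclaimer` function, and the prefix are illustrative; `FunctionToolset` is documented later in this section):

```python
from dataclasses import replace

from pydantic_ai import RunContext
from pydantic_ai.tools import ToolDefinition
from pydantic_ai.toolsets import FunctionToolset

toolset = FunctionToolset()


@toolset.tool
def temperature_celsius(ctx: RunContext[None], city: str) -> float:
    """Get the current temperature for a city."""
    return 21.0


async def add_disclaimer(
    ctx: RunContext[None], tool_defs: list[ToolDefinition]
) -> list[ToolDefinition]:
    # Adjust descriptions only; a prepare function must not add or rename tools.
    return [
        replace(tool_def, description=f'{tool_def.description} (data may be delayed)')
        for tool_def in tool_defs
    ]


# Compose the wrappers: prefix the tool names, then prepare the definitions.
wrapped = toolset.prefixed('weather').prepared(add_disclaimer)
```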
Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def prepared(self, prepare_func: ToolsPrepareFunc[AgentDepsT]) -> PreparedToolset[AgentDepsT]: """Returns a new toolset that prepares this toolset's tools using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](../toolsets.md#preparing-tool-definitions) for more information. """ from .prepared import PreparedToolset return PreparedToolset(self, prepare_func) ``` #### renamed ```python renamed( name_map: dict[str, str], ) -> RenamedToolset[AgentDepsT] ``` Returns a new toolset that renames this toolset's tools using a dictionary mapping new names to original names. See [toolset docs](../../toolsets/#renaming-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def renamed(self, name_map: dict[str, str]) -> RenamedToolset[AgentDepsT]: """Returns a new toolset that renames this toolset's tools using a dictionary mapping new names to original names. See [toolset docs](../toolsets.md#renaming-tools) for more information. """ from .renamed import RenamedToolset return RenamedToolset(self, name_map) ``` #### approval_required ```python approval_required( approval_required_func: Callable[ [ RunContext[AgentDepsT], ToolDefinition, dict[str, Any], ], bool, ] = lambda ctx, tool_def, tool_args: True ) -> ApprovalRequiredToolset[AgentDepsT] ``` Returns a new toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](../../toolsets/#requiring-tool-approval) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def approval_required( self, approval_required_func: Callable[[RunContext[AgentDepsT], ToolDefinition, dict[str, Any]], bool] = ( lambda ctx, tool_def, tool_args: True ), ) -> ApprovalRequiredToolset[AgentDepsT]: """Returns a new toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](../toolsets.md#requiring-tool-approval) for more information. """ from .approval_required import ApprovalRequiredToolset return ApprovalRequiredToolset(self, approval_required_func) ``` ### CombinedToolset Bases: `AbstractToolset[AgentDepsT]` A toolset that combines multiple toolsets. See [toolset docs](../../toolsets/#combining-toolsets) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/combined.py` ```python @dataclass class CombinedToolset(AbstractToolset[AgentDepsT]): """A toolset that combines multiple toolsets. See [toolset docs](../toolsets.md#combining-toolsets) for more information. 
""" toolsets: Sequence[AbstractToolset[AgentDepsT]] _enter_lock: Lock = field(compare=False, init=False, default_factory=Lock) _entered_count: int = field(init=False, default=0) _exit_stack: AsyncExitStack | None = field(init=False, default=None) @property def id(self) -> str | None: return None # pragma: no cover @property def label(self) -> str: return f'{self.__class__.__name__}({", ".join(toolset.label for toolset in self.toolsets)})' # pragma: no cover async def __aenter__(self) -> Self: async with self._enter_lock: if self._entered_count == 0: async with AsyncExitStack() as exit_stack: for toolset in self.toolsets: await exit_stack.enter_async_context(toolset) self._exit_stack = exit_stack.pop_all() self._entered_count += 1 return self async def __aexit__(self, *args: Any) -> bool | None: async with self._enter_lock: self._entered_count -= 1 if self._entered_count == 0 and self._exit_stack is not None: await self._exit_stack.aclose() self._exit_stack = None async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: toolsets_tools = await asyncio.gather(*(toolset.get_tools(ctx) for toolset in self.toolsets)) all_tools: dict[str, ToolsetTool[AgentDepsT]] = {} for toolset, tools in zip(self.toolsets, toolsets_tools): for name, tool in tools.items(): tool_toolset = tool.toolset if existing_tool := all_tools.get(name): capitalized_toolset_label = tool_toolset.label[0].upper() + tool_toolset.label[1:] raise UserError( f'{capitalized_toolset_label} defines a tool whose name conflicts with existing tool from {existing_tool.toolset.label}: {name!r}. {toolset.tool_name_conflict_hint}' ) all_tools[name] = _CombinedToolsetTool( toolset=tool_toolset, tool_def=tool.tool_def, max_retries=tool.max_retries, args_validator=tool.args_validator, source_toolset=toolset, source_tool=tool, ) return all_tools async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: assert isinstance(tool, _CombinedToolsetTool) return await tool.source_toolset.call_tool(name, tool_args, ctx, tool.source_tool) def apply(self, visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None: for toolset in self.toolsets: toolset.apply(visitor) def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: return replace(self, toolsets=[toolset.visit_and_replace(visitor) for toolset in self.toolsets]) ``` ### ExternalToolset Bases: `AbstractToolset[AgentDepsT]` A toolset that holds tools whose results will be produced outside of the Pydantic AI agent run in which they were called. See [toolset docs](../../toolsets/#external-toolset) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/external.py` ```python class ExternalToolset(AbstractToolset[AgentDepsT]): """A toolset that holds tools whose results will be produced outside of the Pydantic AI agent run in which they were called. See [toolset docs](../toolsets.md#external-toolset) for more information. 
""" tool_defs: list[ToolDefinition] _id: str | None def __init__(self, tool_defs: list[ToolDefinition], *, id: str | None = None): self.tool_defs = tool_defs self._id = id @property def id(self) -> str | None: return self._id async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return { tool_def.name: ToolsetTool( toolset=self, tool_def=replace(tool_def, kind='external'), max_retries=0, args_validator=TOOL_SCHEMA_VALIDATOR, ) for tool_def in self.tool_defs } async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: raise NotImplementedError('External tools cannot be called directly') ``` ### ApprovalRequiredToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](../../toolsets/#requiring-tool-approval) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/approval_required.py` ```python @dataclass class ApprovalRequiredToolset(WrapperToolset[AgentDepsT]): """A toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](../toolsets.md#requiring-tool-approval) for more information. """ approval_required_func: Callable[[RunContext[AgentDepsT], ToolDefinition, dict[str, Any]], bool] = ( lambda ctx, tool_def, tool_args: True ) async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: if not ctx.tool_call_approved and self.approval_required_func(ctx, tool.tool_def, tool_args): raise ApprovalRequired return await super().call_tool(name, tool_args, ctx, tool) ``` ### FilteredToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that filters the tools it contains using a filter function that takes the agent context and the tool definition. See [toolset docs](../../toolsets/#filtering-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/filtered.py` ```python @dataclass class FilteredToolset(WrapperToolset[AgentDepsT]): """A toolset that filters the tools it contains using a filter function that takes the agent context and the tool definition. See [toolset docs](../toolsets.md#filtering-tools) for more information. """ filter_func: Callable[[RunContext[AgentDepsT], ToolDefinition], bool] async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return { name: tool for name, tool in (await super().get_tools(ctx)).items() if self.filter_func(ctx, tool.tool_def) } ``` ### FunctionToolset Bases: `AbstractToolset[AgentDepsT]` A toolset that lets Python functions be used as tools. See [toolset docs](../../toolsets/#function-toolset) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ````python class FunctionToolset(AbstractToolset[AgentDepsT]): """A toolset that lets Python functions be used as tools. See [toolset docs](../toolsets.md#function-toolset) for more information. """ tools: dict[str, Tool[Any]] max_retries: int _id: str | None docstring_format: DocstringFormat require_parameter_descriptions: bool schema_generator: type[GenerateJsonSchema] def __init__( self, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = [], *, max_retries: int = 1, id: str | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, ): """Build a new function toolset. 
Args: tools: The tools to add to the toolset. max_retries: The maximum number of retries for each tool during a run. id: An optional unique ID for the toolset. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. docstring_format: Format of tool docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. Applies to all tools, unless overridden when adding a tool. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. Applies to all tools, unless overridden when adding a tool. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. Applies to all tools, unless overridden when adding a tool. """ self.max_retries = max_retries self._id = id self.docstring_format = docstring_format self.require_parameter_descriptions = require_parameter_descriptions self.schema_generator = schema_generator self.tools = {} for tool in tools: if isinstance(tool, Tool): self.add_tool(tool) else: self.add_function(tool) @property def id(self) -> str | None: return self._id @overload def tool(self, func: ToolFuncEither[AgentDepsT, ToolParams], /) -> ToolFuncEither[AgentDepsT, ToolParams]: ... @overload def tool( self, /, *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, ) -> Callable[[ToolFuncEither[AgentDepsT, ToolParams]], ToolFuncEither[AgentDepsT, ToolParams]]: ... def tool( self, func: ToolFuncEither[AgentDepsT, ToolParams] | None = None, /, *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, ) -> Any: """Decorator to register a tool function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@toolset.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext from pydantic_ai.toolsets.function import FunctionToolset toolset = FunctionToolset() @toolset.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @toolset.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y agent = Agent('test', toolsets=[toolset], deps_type=int) result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. 
retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. If `None`, the default value is determined by the toolset. require_parameter_descriptions: If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. schema_generator: The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. """ def tool_decorator( func_: ToolFuncEither[AgentDepsT, ToolParams], ) -> ToolFuncEither[AgentDepsT, ToolParams]: # noinspection PyTypeChecker self.add_function( func_, None, name, retries, prepare, docstring_format, require_parameter_descriptions, schema_generator, strict, sequential, requires_approval, ) return func_ return tool_decorator if func is None else tool_decorator(func) def add_function( self, func: ToolFuncEither[AgentDepsT, ToolParams], takes_ctx: bool | None = None, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, ) -> None: """Add a function as a tool to the toolset. Can take a sync or async function. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). Args: func: The tool function to register. takes_ctx: Whether the function takes a [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. If `None`, this is inferred from the function signature. name: The name of the tool, defaults to the function name. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. If `None`, the default value is determined by the toolset. require_parameter_descriptions: If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. schema_generator: The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. strict: Whether to enforce JSON schema compliance (only affects OpenAI). 
See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. """ if docstring_format is None: docstring_format = self.docstring_format if require_parameter_descriptions is None: require_parameter_descriptions = self.require_parameter_descriptions if schema_generator is None: schema_generator = self.schema_generator tool = Tool[AgentDepsT]( func, takes_ctx=takes_ctx, name=name, max_retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, ) self.add_tool(tool) def add_tool(self, tool: Tool[AgentDepsT]) -> None: """Add a tool to the toolset. Args: tool: The tool to add. """ if tool.name in self.tools: raise UserError(f'Tool name conflicts with existing tool: {tool.name!r}') if tool.max_retries is None: tool.max_retries = self.max_retries self.tools[tool.name] = tool async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: tools: dict[str, ToolsetTool[AgentDepsT]] = {} for original_name, tool in self.tools.items(): run_context = replace(ctx, tool_name=original_name, retry=ctx.retries.get(original_name, 0)) tool_def = await tool.prepare_tool_def(run_context) if not tool_def: continue new_name = tool_def.name if new_name in tools: if new_name != original_name: raise UserError(f'Renaming tool {original_name!r} to {new_name!r} conflicts with existing tool.') else: raise UserError(f'Tool name conflicts with previously renamed tool: {new_name!r}.') tools[new_name] = FunctionToolsetTool( toolset=self, tool_def=tool_def, max_retries=tool.max_retries if tool.max_retries is not None else self.max_retries, args_validator=tool.function_schema.validator, call_func=tool.function_schema.call, is_async=tool.function_schema.is_async, ) return tools async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: assert isinstance(tool, FunctionToolsetTool) return await tool.call_func(tool_args, ctx) ```` #### __init__ ```python __init__( tools: Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] = [], *, max_retries: int = 1, id: str | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema ) ``` Build a new function toolset. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]]` | The tools to add to the toolset. | `[]` | | `max_retries` | `int` | The maximum number of retries for each tool during a run. | `1` | | `id` | `str | None` | An optional unique ID for the toolset. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. | `None` | | `docstring_format` | `DocstringFormat` | Format of tool docstring, see DocstringFormat. Defaults to 'auto', such that the format is inferred from the structure of the docstring. 
Applies to all tools, unless overridden when adding a tool. | `'auto'` | | `require_parameter_descriptions` | `bool` | If True, raise an error if a parameter description is missing. Defaults to False. Applies to all tools, unless overridden when adding a tool. | `False` | | `schema_generator` | `type[GenerateJsonSchema]` | The JSON schema generator class to use for this tool. Defaults to GenerateToolJsonSchema. Applies to all tools, unless overridden when adding a tool. | `GenerateToolJsonSchema` | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ```python def __init__( self, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = [], *, max_retries: int = 1, id: str | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, ): """Build a new function toolset. Args: tools: The tools to add to the toolset. max_retries: The maximum number of retries for each tool during a run. id: An optional unique ID for the toolset. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. docstring_format: Format of tool docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. Applies to all tools, unless overridden when adding a tool. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. Applies to all tools, unless overridden when adding a tool. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. Applies to all tools, unless overridden when adding a tool. """ self.max_retries = max_retries self._id = id self.docstring_format = docstring_format self.require_parameter_descriptions = require_parameter_descriptions self.schema_generator = schema_generator self.tools = {} for tool in tools: if isinstance(tool, Tool): self.add_tool(tool) else: self.add_function(tool) ``` #### tool ```python tool( func: ToolFuncEither[AgentDepsT, ToolParams], ) -> ToolFuncEither[AgentDepsT, ToolParams] ``` ```python tool( *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: ( type[GenerateJsonSchema] | None ) = None, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False ) -> Callable[ [ToolFuncEither[AgentDepsT, ToolParams]], ToolFuncEither[AgentDepsT, ToolParams], ] ``` ```python tool( func: ( ToolFuncEither[AgentDepsT, ToolParams] | None ) = None, /, *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: ( type[GenerateJsonSchema] | None ) = None, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, ) -> Any ``` Decorator to register a tool function which takes RunContext as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../../tools/#function-tools-and-schema). 
We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@toolset.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext from pydantic_ai.toolsets.function import FunctionToolset toolset = FunctionToolset() @toolset.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @toolset.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y agent = Agent('test', toolsets=[toolset], deps_type=int) result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `func` | `ToolFuncEither[AgentDepsT, ToolParams] | None` | The tool function to register. | `None` | | `name` | `str | None` | The name of the tool, defaults to the function name. | `None` | | `retries` | `int | None` | The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. | `None` | | `prepare` | `ToolPrepareFunc[AgentDepsT] | None` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | `None` | | `docstring_format` | `DocstringFormat | None` | The format of the docstring, see DocstringFormat. If None, the default value is determined by the toolset. | `None` | | `require_parameter_descriptions` | `bool | None` | If True, raise an error if a parameter description is missing. If None, the default value is determined by the toolset. | `None` | | `schema_generator` | `type[GenerateJsonSchema] | None` | The JSON schema generator class to use for this tool. If None, the default value is determined by the toolset. | `None` | | `strict` | `bool | None` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. | `None` | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. | `False` | | `requires_approval` | `bool` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. | `False` | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ````python def tool( self, func: ToolFuncEither[AgentDepsT, ToolParams] | None = None, /, *, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, ) -> Any: """Decorator to register a tool function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@toolset.tool` is obscured. 
Example: ```python from pydantic_ai import Agent, RunContext from pydantic_ai.toolsets.function import FunctionToolset toolset = FunctionToolset() @toolset.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @toolset.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y agent = Agent('test', toolsets=[toolset], deps_type=int) result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. If `None`, the default value is determined by the toolset. require_parameter_descriptions: If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. schema_generator: The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. """ def tool_decorator( func_: ToolFuncEither[AgentDepsT, ToolParams], ) -> ToolFuncEither[AgentDepsT, ToolParams]: # noinspection PyTypeChecker self.add_function( func_, None, name, retries, prepare, docstring_format, require_parameter_descriptions, schema_generator, strict, sequential, requires_approval, ) return func_ return tool_decorator if func is None else tool_decorator(func) ```` #### add_function ```python add_function( func: ToolFuncEither[AgentDepsT, ToolParams], takes_ctx: bool | None = None, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: ( type[GenerateJsonSchema] | None ) = None, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, ) -> None ``` Add a function as a tool to the toolset. Can take a sync or async function. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../../tools/#function-tools-and-schema). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `func` | `ToolFuncEither[AgentDepsT, ToolParams]` | The tool function to register. | *required* | | `takes_ctx` | `bool | None` | Whether the function takes a RunContext as its first argument. If None, this is inferred from the function signature. | `None` | | `name` | `str | None` | The name of the tool, defaults to the function name. | `None` | | `retries` | `int | None` | The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. 
| `None` | | `prepare` | `ToolPrepareFunc[AgentDepsT] | None` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | `None` | | `docstring_format` | `DocstringFormat | None` | The format of the docstring, see DocstringFormat. If None, the default value is determined by the toolset. | `None` | | `require_parameter_descriptions` | `bool | None` | If True, raise an error if a parameter description is missing. If None, the default value is determined by the toolset. | `None` | | `schema_generator` | `type[GenerateJsonSchema] | None` | The JSON schema generator class to use for this tool. If None, the default value is determined by the toolset. | `None` | | `strict` | `bool | None` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. | `None` | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. | `False` | | `requires_approval` | `bool` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. | `False` | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ```python def add_function( self, func: ToolFuncEither[AgentDepsT, ToolParams], takes_ctx: bool | None = None, name: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, ) -> None: """Add a function as a tool to the toolset. Can take a sync or async function. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). Args: func: The tool function to register. takes_ctx: Whether the function takes a [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. If `None`, this is inferred from the function signature. name: The name of the tool, defaults to the function name. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. If `None`, the default value is determined by the toolset. require_parameter_descriptions: If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. schema_generator: The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. 
See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. """ if docstring_format is None: docstring_format = self.docstring_format if require_parameter_descriptions is None: require_parameter_descriptions = self.require_parameter_descriptions if schema_generator is None: schema_generator = self.schema_generator tool = Tool[AgentDepsT]( func, takes_ctx=takes_ctx, name=name, max_retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, ) self.add_tool(tool) ``` #### add_tool ```python add_tool(tool: Tool[AgentDepsT]) -> None ``` Add a tool to the toolset. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `tool` | `Tool[AgentDepsT]` | The tool to add. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ```python def add_tool(self, tool: Tool[AgentDepsT]) -> None: """Add a tool to the toolset. Args: tool: The tool to add. """ if tool.name in self.tools: raise UserError(f'Tool name conflicts with existing tool: {tool.name!r}') if tool.max_retries is None: tool.max_retries = self.max_retries self.tools[tool.name] = tool ``` ### PrefixedToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that prefixes the names of the tools it contains. See [toolset docs](../../toolsets/#prefixing-tool-names) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/prefixed.py` ```python @dataclass class PrefixedToolset(WrapperToolset[AgentDepsT]): """A toolset that prefixes the names of the tools it contains. See [toolset docs](../toolsets.md#prefixing-tool-names) for more information. """ prefix: str @property def tool_name_conflict_hint(self) -> str: return 'Change the `prefix` attribute to avoid name conflicts.' async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return { new_name: replace( tool, toolset=self, tool_def=replace(tool.tool_def, name=new_name), ) for name, tool in (await super().get_tools(ctx)).items() if (new_name := f'{self.prefix}_{name}') } async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: original_name = name.removeprefix(self.prefix + '_') ctx = replace(ctx, tool_name=original_name) tool = replace(tool, tool_def=replace(tool.tool_def, name=original_name)) return await super().call_tool(original_name, tool_args, ctx, tool) ``` ### RenamedToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that renames the tools it contains using a dictionary mapping new names to original names. See [toolset docs](../../toolsets/#renaming-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/renamed.py` ```python @dataclass class RenamedToolset(WrapperToolset[AgentDepsT]): """A toolset that renames the tools it contains using a dictionary mapping new names to original names. See [toolset docs](../toolsets.md#renaming-tools) for more information. 
""" name_map: dict[str, str] async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: original_to_new_name_map = {v: k for k, v in self.name_map.items()} original_tools = await super().get_tools(ctx) tools: dict[str, ToolsetTool[AgentDepsT]] = {} for original_name, tool in original_tools.items(): new_name = original_to_new_name_map.get(original_name, None) if new_name: tools[new_name] = replace( tool, toolset=self, tool_def=replace(tool.tool_def, name=new_name), ) else: tools[original_name] = tool return tools async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: original_name = self.name_map.get(name, name) ctx = replace(ctx, tool_name=original_name) tool = replace(tool, tool_def=replace(tool.tool_def, name=original_name)) return await super().call_tool(original_name, tool_args, ctx, tool) ``` ### PreparedToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that prepares the tools it contains using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](../../toolsets/#preparing-tool-definitions) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/prepared.py` ```python @dataclass class PreparedToolset(WrapperToolset[AgentDepsT]): """A toolset that prepares the tools it contains using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](../toolsets.md#preparing-tool-definitions) for more information. """ prepare_func: ToolsPrepareFunc[AgentDepsT] async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: original_tools = await super().get_tools(ctx) original_tool_defs = [tool.tool_def for tool in original_tools.values()] prepared_tool_defs_by_name = { tool_def.name: tool_def for tool_def in (await self.prepare_func(ctx, original_tool_defs) or []) } if len(prepared_tool_defs_by_name.keys() - original_tools.keys()) > 0: raise UserError( 'Prepare function cannot add or rename tools. Use `FunctionToolset.add_function()` or `RenamedToolset` instead.' ) return { name: replace(original_tools[name], tool_def=tool_def) for name, tool_def in prepared_tool_defs_by_name.items() } ``` ### WrapperToolset Bases: `AbstractToolset[AgentDepsT]` A toolset that wraps another toolset and delegates to it. See [toolset docs](../../toolsets/#wrapping-a-toolset) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/wrapper.py` ```python @dataclass class WrapperToolset(AbstractToolset[AgentDepsT]): """A toolset that wraps another toolset and delegates to it. See [toolset docs](../toolsets.md#wrapping-a-toolset) for more information. 
""" wrapped: AbstractToolset[AgentDepsT] @property def id(self) -> str | None: return None # pragma: no cover @property def label(self) -> str: return f'{self.__class__.__name__}({self.wrapped.label})' async def __aenter__(self) -> Self: await self.wrapped.__aenter__() return self async def __aexit__(self, *args: Any) -> bool | None: return await self.wrapped.__aexit__(*args) async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return await self.wrapped.get_tools(ctx) async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: return await self.wrapped.call_tool(name, tool_args, ctx, tool) def apply(self, visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None: self.wrapped.apply(visitor) def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: return replace(self, wrapped=self.wrapped.visit_and_replace(visitor)) ``` ### ToolsetFunc ```python ToolsetFunc: TypeAlias = Callable[ [RunContext[AgentDepsT]], AbstractToolset[AgentDepsT] | None | Awaitable[AbstractToolset[AgentDepsT] | None], ] ``` A sync/async function which takes a run context and returns a toolset. # `pydantic_ai.usage` ### RequestUsage Bases: `UsageBase` LLM usage associated with a single request. This is an implementation of `genai_prices.types.AbstractUsage` so it can be used to calculate the price of the request using [genai-prices](https://github.com/pydantic/genai-prices). Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @dataclass(repr=False, kw_only=True) class RequestUsage(UsageBase): """LLM usage associated with a single request. This is an implementation of `genai_prices.types.AbstractUsage` so it can be used to calculate the price of the request using [genai-prices](https://github.com/pydantic/genai-prices). """ @property def requests(self): return 1 def incr(self, incr_usage: RequestUsage) -> None: """Increment the usage in place. Args: incr_usage: The usage to increment by. """ return _incr_usage_tokens(self, incr_usage) def __add__(self, other: RequestUsage) -> RequestUsage: """Add two RequestUsages together. This is provided so it's trivial to sum usage information from multiple parts of a response. **WARNING:** this CANNOT be used to sum multiple requests without breaking some pricing calculations. """ new_usage = copy(self) new_usage.incr(other) return new_usage ``` #### incr ```python incr(incr_usage: RequestUsage) -> None ``` Increment the usage in place. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `incr_usage` | `RequestUsage` | The usage to increment by. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def incr(self, incr_usage: RequestUsage) -> None: """Increment the usage in place. Args: incr_usage: The usage to increment by. """ return _incr_usage_tokens(self, incr_usage) ``` #### __add__ ```python __add__(other: RequestUsage) -> RequestUsage ``` Add two RequestUsages together. This is provided so it's trivial to sum usage information from multiple parts of a response. **WARNING:** this CANNOT be used to sum multiple requests without breaking some pricing calculations. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def __add__(self, other: RequestUsage) -> RequestUsage: """Add two RequestUsages together. This is provided so it's trivial to sum usage information from multiple parts of a response. 
**WARNING:** this CANNOT be used to sum multiple requests without breaking some pricing calculations. """ new_usage = copy(self) new_usage.incr(other) return new_usage ``` ### RunUsage Bases: `UsageBase` LLM usage associated with an agent run. Responsibility for calculating request usage is on the model; Pydantic AI simply sums the usage information across requests. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @dataclass(repr=False, kw_only=True) class RunUsage(UsageBase): """LLM usage associated with an agent run. Responsibility for calculating request usage is on the model; Pydantic AI simply sums the usage information across requests. """ requests: int = 0 """Number of requests made to the LLM API.""" tool_calls: int = 0 """Number of successful tool calls executed during the run.""" input_tokens: int = 0 """Total number of text input/prompt tokens.""" cache_write_tokens: int = 0 """Total number of tokens written to the cache.""" cache_read_tokens: int = 0 """Total number of tokens read from the cache.""" input_audio_tokens: int = 0 """Total number of audio input tokens.""" cache_audio_read_tokens: int = 0 """Total number of audio tokens read from the cache.""" output_tokens: int = 0 """Total number of text output/completion tokens.""" details: dict[str, int] = dataclasses.field(default_factory=dict) """Any extra details returned by the model.""" def incr(self, incr_usage: RunUsage | RequestUsage) -> None: """Increment the usage in place. Args: incr_usage: The usage to increment by. """ if isinstance(incr_usage, RunUsage): self.requests += incr_usage.requests self.tool_calls += incr_usage.tool_calls return _incr_usage_tokens(self, incr_usage) def __add__(self, other: RunUsage | RequestUsage) -> RunUsage: """Add two RunUsages together. This is provided so it's trivial to sum usage information from multiple runs. """ new_usage = copy(self) new_usage.incr(other) return new_usage ``` #### requests ```python requests: int = 0 ``` Number of requests made to the LLM API. #### tool_calls ```python tool_calls: int = 0 ``` Number of successful tool calls executed during the run. #### input_tokens ```python input_tokens: int = 0 ``` Total number of text input/prompt tokens. #### cache_write_tokens ```python cache_write_tokens: int = 0 ``` Total number of tokens written to the cache. #### cache_read_tokens ```python cache_read_tokens: int = 0 ``` Total number of tokens read from the cache. #### input_audio_tokens ```python input_audio_tokens: int = 0 ``` Total number of audio input tokens. #### cache_audio_read_tokens ```python cache_audio_read_tokens: int = 0 ``` Total number of audio tokens read from the cache. #### output_tokens ```python output_tokens: int = 0 ``` Total number of text output/completion tokens. #### details ```python details: dict[str, int] = field(default_factory=dict) ``` Any extra details returned by the model. #### incr ```python incr(incr_usage: RunUsage | RequestUsage) -> None ``` Increment the usage in place. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `incr_usage` | `RunUsage | RequestUsage` | The usage to increment by. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def incr(self, incr_usage: RunUsage | RequestUsage) -> None: """Increment the usage in place. Args: incr_usage: The usage to increment by. 
""" if isinstance(incr_usage, RunUsage): self.requests += incr_usage.requests self.tool_calls += incr_usage.tool_calls return _incr_usage_tokens(self, incr_usage) ``` #### __add__ ```python __add__(other: RunUsage | RequestUsage) -> RunUsage ``` Add two RunUsages together. This is provided so it's trivial to sum usage information from multiple runs. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def __add__(self, other: RunUsage | RequestUsage) -> RunUsage: """Add two RunUsages together. This is provided so it's trivial to sum usage information from multiple runs. """ new_usage = copy(self) new_usage.incr(other) return new_usage ``` ### Usage Bases: `RunUsage` Deprecated `Usage` is deprecated, use `RunUsage` instead Deprecated alias for `RunUsage`. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @dataclass(repr=False, kw_only=True) @deprecated('`Usage` is deprecated, use `RunUsage` instead') class Usage(RunUsage): """Deprecated alias for `RunUsage`.""" ``` ### UsageLimits Limits on model usage. The request count is tracked by pydantic_ai, and the request limit is checked before each request to the model. Token counts are provided in responses from the model, and the token limits are checked after each response. Each of the limits can be set to `None` to disable that limit. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @dataclass(repr=False, kw_only=True) class UsageLimits: """Limits on model usage. The request count is tracked by pydantic_ai, and the request limit is checked before each request to the model. Token counts are provided in responses from the model, and the token limits are checked after each response. Each of the limits can be set to `None` to disable that limit. """ request_limit: int | None = 50 """The maximum number of requests allowed to the model.""" tool_calls_limit: int | None = None """The maximum number of successful tool calls allowed to be executed.""" input_tokens_limit: int | None = None """The maximum number of input/prompt tokens allowed.""" output_tokens_limit: int | None = None """The maximum number of output/response tokens allowed.""" total_tokens_limit: int | None = None """The maximum number of tokens allowed in requests and responses combined.""" count_tokens_before_request: bool = False """If True, perform a token counting pass before sending the request to the model, to enforce `request_tokens_limit` ahead of time. 
This may incur additional overhead (from calling the model's `count_tokens` API before making the actual request) and is disabled by default.""" @property @deprecated('`request_tokens_limit` is deprecated, use `input_tokens_limit` instead') def request_tokens_limit(self) -> int | None: return self.input_tokens_limit @property @deprecated('`response_tokens_limit` is deprecated, use `output_tokens_limit` instead') def response_tokens_limit(self) -> int | None: return self.output_tokens_limit @overload def __init__( self, *, request_limit: int | None = 50, tool_calls_limit: int | None = None, input_tokens_limit: int | None = None, output_tokens_limit: int | None = None, total_tokens_limit: int | None = None, count_tokens_before_request: bool = False, ) -> None: self.request_limit = request_limit self.tool_calls_limit = tool_calls_limit self.input_tokens_limit = input_tokens_limit self.output_tokens_limit = output_tokens_limit self.total_tokens_limit = total_tokens_limit self.count_tokens_before_request = count_tokens_before_request @overload @deprecated( 'Use `input_tokens_limit` instead of `request_tokens_limit` and `output_tokens_limit` and `total_tokens_limit`' ) def __init__( self, *, request_limit: int | None = 50, tool_calls_limit: int | None = None, request_tokens_limit: int | None = None, response_tokens_limit: int | None = None, total_tokens_limit: int | None = None, count_tokens_before_request: bool = False, ) -> None: self.request_limit = request_limit self.tool_calls_limit = tool_calls_limit self.input_tokens_limit = request_tokens_limit self.output_tokens_limit = response_tokens_limit self.total_tokens_limit = total_tokens_limit self.count_tokens_before_request = count_tokens_before_request def __init__( self, *, request_limit: int | None = 50, tool_calls_limit: int | None = None, input_tokens_limit: int | None = None, output_tokens_limit: int | None = None, total_tokens_limit: int | None = None, count_tokens_before_request: bool = False, # deprecated: request_tokens_limit: int | None = None, response_tokens_limit: int | None = None, ): self.request_limit = request_limit self.tool_calls_limit = tool_calls_limit self.input_tokens_limit = input_tokens_limit or request_tokens_limit self.output_tokens_limit = output_tokens_limit or response_tokens_limit self.total_tokens_limit = total_tokens_limit self.count_tokens_before_request = count_tokens_before_request def has_token_limits(self) -> bool: """Returns `True` if this instance places any limits on token counts. If this returns `False`, the `check_tokens` method will never raise an error. This is useful because if we have token limits, we need to check them after receiving each streamed message. If there are no limits, we can skip that processing in the streaming response iterator. 
""" return any( limit is not None for limit in (self.input_tokens_limit, self.output_tokens_limit, self.total_tokens_limit) ) def check_before_request(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the next request would exceed any of the limits.""" request_limit = self.request_limit if request_limit is not None and usage.requests >= request_limit: raise UsageLimitExceeded(f'The next request would exceed the request_limit of {request_limit}') input_tokens = usage.input_tokens if self.input_tokens_limit is not None and input_tokens > self.input_tokens_limit: raise UsageLimitExceeded( f'The next request would exceed the input_tokens_limit of {self.input_tokens_limit} ({input_tokens=})' ) total_tokens = usage.total_tokens if self.total_tokens_limit is not None and total_tokens > self.total_tokens_limit: raise UsageLimitExceeded( # pragma: lax no cover f'The next request would exceed the total_tokens_limit of {self.total_tokens_limit} ({total_tokens=})' ) def check_tokens(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the usage exceeds any of the token limits.""" input_tokens = usage.input_tokens if self.input_tokens_limit is not None and input_tokens > self.input_tokens_limit: raise UsageLimitExceeded(f'Exceeded the input_tokens_limit of {self.input_tokens_limit} ({input_tokens=})') output_tokens = usage.output_tokens if self.output_tokens_limit is not None and output_tokens > self.output_tokens_limit: raise UsageLimitExceeded( f'Exceeded the output_tokens_limit of {self.output_tokens_limit} ({output_tokens=})' ) total_tokens = usage.total_tokens if self.total_tokens_limit is not None and total_tokens > self.total_tokens_limit: raise UsageLimitExceeded(f'Exceeded the total_tokens_limit of {self.total_tokens_limit} ({total_tokens=})') def check_before_tool_call(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the next tool call would exceed the tool call limit.""" tool_calls_limit = self.tool_calls_limit if tool_calls_limit is not None and usage.tool_calls >= tool_calls_limit: raise UsageLimitExceeded( f'The next tool call would exceed the tool_calls_limit of {tool_calls_limit} (tool_calls={usage.tool_calls})' ) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### request_limit ```python request_limit: int | None = request_limit ``` The maximum number of requests allowed to the model. #### tool_calls_limit ```python tool_calls_limit: int | None = tool_calls_limit ``` The maximum number of successful tool calls allowed to be executed. #### input_tokens_limit ```python input_tokens_limit: int | None = ( input_tokens_limit or request_tokens_limit ) ``` The maximum number of input/prompt tokens allowed. #### output_tokens_limit ```python output_tokens_limit: int | None = ( output_tokens_limit or response_tokens_limit ) ``` The maximum number of output/response tokens allowed. #### total_tokens_limit ```python total_tokens_limit: int | None = total_tokens_limit ``` The maximum number of tokens allowed in requests and responses combined. #### count_tokens_before_request ```python count_tokens_before_request: bool = ( count_tokens_before_request ) ``` If True, perform a token counting pass before sending the request to the model, to enforce `request_tokens_limit` ahead of time. This may incur additional overhead (from calling the model's `count_tokens` API before making the actual request) and is disabled by default. 
#### has_token_limits ```python has_token_limits() -> bool ``` Returns `True` if this instance places any limits on token counts. If this returns `False`, the `check_tokens` method will never raise an error. This is useful because if we have token limits, we need to check them after receiving each streamed message. If there are no limits, we can skip that processing in the streaming response iterator. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def has_token_limits(self) -> bool: """Returns `True` if this instance places any limits on token counts. If this returns `False`, the `check_tokens` method will never raise an error. This is useful because if we have token limits, we need to check them after receiving each streamed message. If there are no limits, we can skip that processing in the streaming response iterator. """ return any( limit is not None for limit in (self.input_tokens_limit, self.output_tokens_limit, self.total_tokens_limit) ) ``` #### check_before_request ```python check_before_request(usage: RunUsage) -> None ``` Raises a `UsageLimitExceeded` exception if the next request would exceed any of the limits. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def check_before_request(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the next request would exceed any of the limits.""" request_limit = self.request_limit if request_limit is not None and usage.requests >= request_limit: raise UsageLimitExceeded(f'The next request would exceed the request_limit of {request_limit}') input_tokens = usage.input_tokens if self.input_tokens_limit is not None and input_tokens > self.input_tokens_limit: raise UsageLimitExceeded( f'The next request would exceed the input_tokens_limit of {self.input_tokens_limit} ({input_tokens=})' ) total_tokens = usage.total_tokens if self.total_tokens_limit is not None and total_tokens > self.total_tokens_limit: raise UsageLimitExceeded( # pragma: lax no cover f'The next request would exceed the total_tokens_limit of {self.total_tokens_limit} ({total_tokens=})' ) ``` #### check_tokens ```python check_tokens(usage: RunUsage) -> None ``` Raises a `UsageLimitExceeded` exception if the usage exceeds any of the token limits. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def check_tokens(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the usage exceeds any of the token limits.""" input_tokens = usage.input_tokens if self.input_tokens_limit is not None and input_tokens > self.input_tokens_limit: raise UsageLimitExceeded(f'Exceeded the input_tokens_limit of {self.input_tokens_limit} ({input_tokens=})') output_tokens = usage.output_tokens if self.output_tokens_limit is not None and output_tokens > self.output_tokens_limit: raise UsageLimitExceeded( f'Exceeded the output_tokens_limit of {self.output_tokens_limit} ({output_tokens=})' ) total_tokens = usage.total_tokens if self.total_tokens_limit is not None and total_tokens > self.total_tokens_limit: raise UsageLimitExceeded(f'Exceeded the total_tokens_limit of {self.total_tokens_limit} ({total_tokens=})') ``` #### check_before_tool_call ```python check_before_tool_call(usage: RunUsage) -> None ``` Raises a `UsageLimitExceeded` exception if the next tool call would exceed the tool call limit. 
Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def check_before_tool_call(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the next tool call would exceed the tool call limit.""" tool_calls_limit = self.tool_calls_limit if tool_calls_limit is not None and usage.tool_calls >= tool_calls_limit: raise UsageLimitExceeded( f'The next tool call would exceed the tool_calls_limit of {tool_calls_limit} (tool_calls={usage.tool_calls})' ) ``` # `pydantic_ai.models.anthropic` ## Setup For details on how to set up authentication with this model, see [model configuration for Anthropic](../../../models/anthropic/). ### LatestAnthropicModelNames ```python LatestAnthropicModelNames = ModelParam ``` Latest Anthropic models. ### AnthropicModelName ```python AnthropicModelName = str | LatestAnthropicModelNames ``` Possible Anthropic model names. Since Anthropic supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Anthropic docs](https://docs.anthropic.com/en/docs/about-claude/models) for a full list. ### AnthropicModelSettings Bases: `ModelSettings` Settings used for an Anthropic model request. Source code in `pydantic_ai_slim/pydantic_ai/models/anthropic.py` ```python class AnthropicModelSettings(ModelSettings, total=False): """Settings used for an Anthropic model request.""" # ALL FIELDS MUST BE `anthropic_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. anthropic_metadata: BetaMetadataParam """An object describing metadata about the request. Contains `user_id`, an external identifier for the user who is associated with the request. """ anthropic_thinking: BetaThinkingConfigParam """Determine whether the model should generate a thinking block. See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking) for more information. """ ``` #### anthropic_metadata ```python anthropic_metadata: BetaMetadataParam ``` An object describing metadata about the request. Contains `user_id`, an external identifier for the user who is associated with the request. #### anthropic_thinking ```python anthropic_thinking: BetaThinkingConfigParam ``` Determine whether the model should generate a thinking block. See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking) for more information. ### AnthropicModel Bases: `Model` A model that uses the Anthropic API. Internally, this uses the [Anthropic Python client](https://github.com/anthropics/anthropic-sdk-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/anthropic.py` ```python @dataclass(init=False) class AnthropicModel(Model): """A model that uses the Anthropic API. Internally, this uses the [Anthropic Python client](https://github.com/anthropics/anthropic-sdk-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncAnthropicClient = field(repr=False) _model_name: AnthropicModelName = field(repr=False) _provider: Provider[AsyncAnthropicClient] = field(repr=False) def __init__( self, model_name: AnthropicModelName, *, provider: Literal['anthropic'] | Provider[AsyncAnthropicClient] = 'anthropic', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an Anthropic model. Args: model_name: The name of the Anthropic model to use. 
List of model names available [here](https://docs.anthropic.com/en/docs/about-claude/models). provider: The provider to use for the Anthropic API. Can be either the string 'anthropic' or an instance of `Provider[AsyncAnthropicClient]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return str(self.client.base_url) @property def model_name(self) -> AnthropicModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() response = await self._messages_create( messages, False, cast(AnthropicModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() response = await self._messages_create( messages, True, cast(AnthropicModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) @overload async def _messages_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: AnthropicModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncStream[BetaRawMessageStreamEvent]: pass @overload async def _messages_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: AnthropicModelSettings, model_request_parameters: ModelRequestParameters, ) -> BetaMessage: pass async def _messages_create( self, messages: list[ModelMessage], stream: bool, model_settings: AnthropicModelSettings, model_request_parameters: ModelRequestParameters, ) -> BetaMessage | AsyncStream[BetaRawMessageStreamEvent]: # standalone function to make it easier to override tools = self._get_tools(model_request_parameters) builtin_tools, tool_headers = self._get_builtin_tools(model_request_parameters) tools += builtin_tools tool_choice: BetaToolChoiceParam | None if not tools: tool_choice = None else: if not model_request_parameters.allow_text_output: tool_choice = {'type': 'any'} else: tool_choice = {'type': 'auto'} if (allow_parallel_tool_calls := model_settings.get('parallel_tool_calls')) is not None: tool_choice['disable_parallel_tool_use'] = not allow_parallel_tool_calls system_prompt, anthropic_messages = await self._map_message(messages) try: extra_headers = model_settings.get('extra_headers', {}) for k, v in tool_headers.items(): extra_headers.setdefault(k, v) extra_headers.setdefault('User-Agent', get_user_agent()) return await self.client.beta.messages.create( max_tokens=model_settings.get('max_tokens', 4096), system=system_prompt or NOT_GIVEN, messages=anthropic_messages, model=self._model_name, tools=tools or 
NOT_GIVEN, tool_choice=tool_choice or NOT_GIVEN, stream=stream, thinking=model_settings.get('anthropic_thinking', NOT_GIVEN), stop_sequences=model_settings.get('stop_sequences', NOT_GIVEN), temperature=model_settings.get('temperature', NOT_GIVEN), top_p=model_settings.get('top_p', NOT_GIVEN), timeout=model_settings.get('timeout', NOT_GIVEN), metadata=model_settings.get('anthropic_metadata', NOT_GIVEN), extra_headers=extra_headers, extra_body=model_settings.get('extra_body'), ) except APIStatusError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover def _process_response(self, response: BetaMessage) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" items: list[ModelResponsePart] = [] for item in response.content: if isinstance(item, BetaTextBlock): items.append(TextPart(content=item.text)) elif isinstance(item, BetaWebSearchToolResultBlock | BetaCodeExecutionToolResultBlock): items.append( BuiltinToolReturnPart( provider_name=self.system, tool_name=item.type, content=item.content, tool_call_id=item.tool_use_id, ) ) elif isinstance(item, BetaServerToolUseBlock): items.append( BuiltinToolCallPart( provider_name=self.system, tool_name=item.name, args=cast(dict[str, Any], item.input), tool_call_id=item.id, ) ) elif isinstance(item, BetaRedactedThinkingBlock): items.append( ThinkingPart(id='redacted_thinking', content='', signature=item.data, provider_name=self.system) ) elif isinstance(item, BetaThinkingBlock): items.append(ThinkingPart(content=item.thinking, signature=item.signature, provider_name=self.system)) else: assert isinstance(item, BetaToolUseBlock), f'unexpected item type {type(item)}' items.append( ToolCallPart( tool_name=item.name, args=cast(dict[str, Any], item.input), tool_call_id=item.id, ) ) finish_reason: FinishReason | None = None provider_details: dict[str, Any] | None = None if raw_finish_reason := response.stop_reason: # pragma: no branch provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=items, usage=_map_usage(response), model_name=response.model, provider_response_id=response.id, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: AsyncStream[BetaRawMessageStreamEvent], model_request_parameters: ModelRequestParameters ) -> StreamedResponse: peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior('Streamed response ended without content or tool calls') # pragma: no cover assert isinstance(first_chunk, BetaRawMessageStartEvent) return AnthropicStreamedResponse( model_request_parameters=model_request_parameters, _model_name=first_chunk.message.model, _response=peekable_response, _timestamp=_utils.now_utc(), _provider_name=self._provider.name, ) def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[BetaToolParam]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] def _get_builtin_tools( self, model_request_parameters: ModelRequestParameters ) -> tuple[list[BetaToolUnionParam], dict[str, str]]: tools: list[BetaToolUnionParam] = [] extra_headers: dict[str, str] = {} for tool in model_request_parameters.builtin_tools: if isinstance(tool, 
WebSearchTool): user_location = UserLocation(type='approximate', **tool.user_location) if tool.user_location else None tools.append( BetaWebSearchTool20250305Param( name='web_search', type='web_search_20250305', max_uses=tool.max_uses, allowed_domains=tool.allowed_domains, blocked_domains=tool.blocked_domains, user_location=user_location, ) ) elif isinstance(tool, CodeExecutionTool): # pragma: no branch extra_headers['anthropic-beta'] = 'code-execution-2025-05-22' tools.append(BetaCodeExecutionTool20250522Param(name='code_execution', type='code_execution_20250522')) else: # pragma: no cover raise UserError( f'`{tool.__class__.__name__}` is not supported by `AnthropicModel`. If it should be, please file an issue.' ) return tools, extra_headers async def _map_message(self, messages: list[ModelMessage]) -> tuple[str, list[BetaMessageParam]]: # noqa: C901 """Just maps a `pydantic_ai.Message` to a `anthropic.types.MessageParam`.""" system_prompt_parts: list[str] = [] anthropic_messages: list[BetaMessageParam] = [] for m in messages: if isinstance(m, ModelRequest): user_content_params: list[BetaContentBlockParam] = [] for request_part in m.parts: if isinstance(request_part, SystemPromptPart): system_prompt_parts.append(request_part.content) elif isinstance(request_part, UserPromptPart): async for content in self._map_user_prompt(request_part): user_content_params.append(content) elif isinstance(request_part, ToolReturnPart): tool_result_block_param = BetaToolResultBlockParam( tool_use_id=_guard_tool_call_id(t=request_part), type='tool_result', content=request_part.model_response_str(), is_error=False, ) user_content_params.append(tool_result_block_param) elif isinstance(request_part, RetryPromptPart): # pragma: no branch if request_part.tool_name is None: text = request_part.model_response() # pragma: no cover retry_param = BetaTextBlockParam(type='text', text=text) # pragma: no cover else: retry_param = BetaToolResultBlockParam( tool_use_id=_guard_tool_call_id(t=request_part), type='tool_result', content=request_part.model_response(), is_error=True, ) user_content_params.append(retry_param) if len(user_content_params) > 0: anthropic_messages.append(BetaMessageParam(role='user', content=user_content_params)) elif isinstance(m, ModelResponse): assistant_content_params: list[ BetaTextBlockParam | BetaToolUseBlockParam | BetaServerToolUseBlockParam | BetaWebSearchToolResultBlockParam | BetaCodeExecutionToolResultBlockParam | BetaThinkingBlockParam | BetaRedactedThinkingBlockParam ] = [] for response_part in m.parts: if isinstance(response_part, TextPart): if response_part.content: assistant_content_params.append(BetaTextBlockParam(text=response_part.content, type='text')) elif isinstance(response_part, ToolCallPart): tool_use_block_param = BetaToolUseBlockParam( id=_guard_tool_call_id(t=response_part), type='tool_use', name=response_part.tool_name, input=response_part.args_as_dict(), ) assistant_content_params.append(tool_use_block_param) elif isinstance(response_part, ThinkingPart): if ( response_part.provider_name == self.system and response_part.signature is not None ): # pragma: no branch if response_part.id == 'redacted_thinking': assistant_content_params.append( BetaRedactedThinkingBlockParam( data=response_part.signature, type='redacted_thinking', ) ) else: assistant_content_params.append( BetaThinkingBlockParam( thinking=response_part.content, signature=response_part.signature, type='thinking', ) ) elif response_part.content: # pragma: no branch start_tag, end_tag = 
self.profile.thinking_tags assistant_content_params.append( BetaTextBlockParam( text='\n'.join([start_tag, response_part.content, end_tag]), type='text' ) ) elif isinstance(response_part, BuiltinToolCallPart): if response_part.provider_name == self.system: server_tool_use_block_param = BetaServerToolUseBlockParam( id=_guard_tool_call_id(t=response_part), type='server_tool_use', name=cast(Literal['web_search', 'code_execution'], response_part.tool_name), input=response_part.args_as_dict(), ) assistant_content_params.append(server_tool_use_block_param) elif isinstance(response_part, BuiltinToolReturnPart): if response_part.provider_name == self.system: tool_use_id = _guard_tool_call_id(t=response_part) if response_part.tool_name == 'web_search_tool_result': server_tool_result_block_param = BetaWebSearchToolResultBlockParam( tool_use_id=tool_use_id, type=response_part.tool_name, content=response_part.content ) elif response_part.tool_name == 'code_execution_tool_result': server_tool_result_block_param = BetaCodeExecutionToolResultBlockParam( tool_use_id=tool_use_id, type=response_part.tool_name, content=response_part.content ) else: raise ValueError(f'Unsupported tool name: {response_part.tool_name}') assistant_content_params.append(server_tool_result_block_param) else: assert_never(response_part) if len(assistant_content_params) > 0: anthropic_messages.append(BetaMessageParam(role='assistant', content=assistant_content_params)) else: assert_never(m) if instructions := self._get_instructions(messages): system_prompt_parts.insert(0, instructions) system_prompt = '\n\n'.join(system_prompt_parts) return system_prompt, anthropic_messages @staticmethod async def _map_user_prompt( part: UserPromptPart, ) -> AsyncGenerator[BetaContentBlockParam]: if isinstance(part.content, str): if part.content: # Only yield non-empty text yield BetaTextBlockParam(text=part.content, type='text') else: for item in part.content: if isinstance(item, str): if item: # Only yield non-empty text yield BetaTextBlockParam(text=item, type='text') elif isinstance(item, BinaryContent): if item.is_image: yield BetaImageBlockParam( source={'data': io.BytesIO(item.data), 'media_type': item.media_type, 'type': 'base64'}, # type: ignore type='image', ) elif item.media_type == 'application/pdf': yield BetaBase64PDFBlockParam( source=BetaBase64PDFSourceParam( data=io.BytesIO(item.data), media_type='application/pdf', type='base64', ), type='document', ) else: raise RuntimeError('Only images and PDFs are supported for binary content') elif isinstance(item, ImageUrl): yield BetaImageBlockParam(source={'type': 'url', 'url': item.url}, type='image') elif isinstance(item, DocumentUrl): if item.media_type == 'application/pdf': yield BetaBase64PDFBlockParam(source={'url': item.url, 'type': 'url'}, type='document') elif item.media_type == 'text/plain': downloaded_item = await download_item(item, data_format='text') yield BetaBase64PDFBlockParam( source=BetaPlainTextSourceParam( data=downloaded_item['data'], media_type=item.media_type, type='text' ), type='document', ) else: # pragma: no cover raise RuntimeError(f'Unsupported media type: {item.media_type}') else: raise RuntimeError(f'Unsupported content type: {type(item)}') # pragma: no cover @staticmethod def _map_tool_definition(f: ToolDefinition) -> BetaToolParam: return { 'name': f.name, 'description': f.description or '', 'input_schema': f.parameters_json_schema, } ``` #### __init__ ```python __init__( model_name: AnthropicModelName, *, provider: ( Literal["anthropic"] | 
Provider[AsyncAnthropicClient] ) = "anthropic", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize an Anthropic model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `AnthropicModelName` | The name of the Anthropic model to use. List of model names available here. | *required* | | `provider` | `Literal['anthropic'] | Provider[AsyncAnthropicClient]` | The provider to use for the Anthropic API. Can be either the string 'anthropic' or an instance of Provider[AsyncAnthropicClient]. If not provided, the other parameters will be used. | `'anthropic'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | Default model settings for this model instance. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/anthropic.py` ```python def __init__( self, model_name: AnthropicModelName, *, provider: Literal['anthropic'] | Provider[AsyncAnthropicClient] = 'anthropic', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an Anthropic model. Args: model_name: The name of the Anthropic model to use. List of model names available [here](https://docs.anthropic.com/en/docs/about-claude/models). provider: The provider to use for the Anthropic API. Can be either the string 'anthropic' or an instance of `Provider[AsyncAnthropicClient]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: AnthropicModelName ``` The model name. #### system ```python system: str ``` The model provider. ### AnthropicStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Anthropic models. 
Source code in `pydantic_ai_slim/pydantic_ai/models/anthropic.py` ```python @dataclass class AnthropicStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Anthropic models.""" _model_name: AnthropicModelName _response: AsyncIterable[BetaRawMessageStreamEvent] _timestamp: datetime _provider_name: str async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: # noqa: C901 current_block: BetaContentBlock | None = None async for event in self._response: if isinstance(event, BetaRawMessageStartEvent): self._usage = _map_usage(event) self.provider_response_id = event.message.id elif isinstance(event, BetaRawContentBlockStartEvent): current_block = event.content_block if isinstance(current_block, BetaTextBlock) and current_block.text: maybe_event = self._parts_manager.handle_text_delta( vendor_part_id=event.index, content=current_block.text ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(current_block, BetaThinkingBlock): yield self._parts_manager.handle_thinking_delta( vendor_part_id=event.index, content=current_block.thinking, signature=current_block.signature, provider_name=self.provider_name, ) elif isinstance(current_block, BetaRedactedThinkingBlock): yield self._parts_manager.handle_thinking_delta( vendor_part_id=event.index, id='redacted_thinking', signature=current_block.data, provider_name=self.provider_name, ) elif isinstance(current_block, BetaToolUseBlock): maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=current_block.id, tool_name=current_block.name, args=cast(dict[str, Any], current_block.input) or None, tool_call_id=current_block.id, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(current_block, BetaServerToolUseBlock): pass elif isinstance(event, BetaRawContentBlockDeltaEvent): if isinstance(event.delta, BetaTextDelta): maybe_event = self._parts_manager.handle_text_delta( vendor_part_id=event.index, content=event.delta.text ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(event.delta, BetaThinkingDelta): yield self._parts_manager.handle_thinking_delta( vendor_part_id=event.index, content=event.delta.thinking, provider_name=self.provider_name, ) elif isinstance(event.delta, BetaSignatureDelta): yield self._parts_manager.handle_thinking_delta( vendor_part_id=event.index, signature=event.delta.signature, provider_name=self.provider_name, ) elif ( current_block and event.delta.type == 'input_json_delta' and isinstance(current_block, BetaToolUseBlock) ): # pragma: no branch maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=current_block.id, tool_name='', args=event.delta.partial_json, tool_call_id=current_block.id, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(event.delta, BetaInputJSONDelta): pass # TODO(Marcelo): We need to handle citations. 
elif isinstance(event.delta, BetaCitationsDelta): pass elif isinstance(event, BetaRawMessageDeltaEvent): self._usage = _map_usage(event) if raw_finish_reason := event.delta.stop_reason: # pragma: no branch self.provider_details = {'finish_reason': raw_finish_reason} self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) elif isinstance(event, BetaRawContentBlockStopEvent | BetaRawMessageStopEvent): # pragma: no branch current_block = None @property def model_name(self) -> AnthropicModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: AnthropicModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models` Logic related to making requests to an LLM. The aim here is to make a common interface for different LLMs, so that the rest of the code can be agnostic to the specific LLM being used. ### KnownModelName ```python KnownModelName = TypeAliasType( "KnownModelName", Literal[ "anthropic:claude-3-5-haiku-20241022", "anthropic:claude-3-5-haiku-latest", "anthropic:claude-3-5-sonnet-20240620", "anthropic:claude-3-5-sonnet-20241022", "anthropic:claude-3-5-sonnet-latest", "anthropic:claude-3-7-sonnet-20250219", "anthropic:claude-3-7-sonnet-latest", "anthropic:claude-3-haiku-20240307", "anthropic:claude-3-opus-20240229", "anthropic:claude-3-opus-latest", "anthropic:claude-4-opus-20250514", "anthropic:claude-4-sonnet-20250514", "anthropic:claude-opus-4-0", "anthropic:claude-opus-4-1-20250805", "anthropic:claude-opus-4-20250514", "anthropic:claude-sonnet-4-0", "anthropic:claude-sonnet-4-20250514", "bedrock:amazon.titan-tg1-large", "bedrock:amazon.titan-text-lite-v1", "bedrock:amazon.titan-text-express-v1", "bedrock:us.amazon.nova-pro-v1:0", "bedrock:us.amazon.nova-lite-v1:0", "bedrock:us.amazon.nova-micro-v1:0", "bedrock:anthropic.claude-3-5-sonnet-20241022-v2:0", "bedrock:us.anthropic.claude-3-5-sonnet-20241022-v2:0", "bedrock:anthropic.claude-3-5-haiku-20241022-v1:0", "bedrock:us.anthropic.claude-3-5-haiku-20241022-v1:0", "bedrock:anthropic.claude-instant-v1", "bedrock:anthropic.claude-v2:1", "bedrock:anthropic.claude-v2", "bedrock:anthropic.claude-3-sonnet-20240229-v1:0", "bedrock:us.anthropic.claude-3-sonnet-20240229-v1:0", "bedrock:anthropic.claude-3-haiku-20240307-v1:0", "bedrock:us.anthropic.claude-3-haiku-20240307-v1:0", "bedrock:anthropic.claude-3-opus-20240229-v1:0", "bedrock:us.anthropic.claude-3-opus-20240229-v1:0", "bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0", "bedrock:us.anthropic.claude-3-5-sonnet-20240620-v1:0", "bedrock:anthropic.claude-3-7-sonnet-20250219-v1:0", "bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0", "bedrock:anthropic.claude-opus-4-20250514-v1:0", "bedrock:us.anthropic.claude-opus-4-20250514-v1:0", "bedrock:anthropic.claude-sonnet-4-20250514-v1:0", "bedrock:us.anthropic.claude-sonnet-4-20250514-v1:0", "bedrock:cohere.command-text-v14", "bedrock:cohere.command-r-v1:0", "bedrock:cohere.command-r-plus-v1:0", "bedrock:cohere.command-light-text-v14", "bedrock:meta.llama3-8b-instruct-v1:0", "bedrock:meta.llama3-70b-instruct-v1:0", "bedrock:meta.llama3-1-8b-instruct-v1:0", "bedrock:us.meta.llama3-1-8b-instruct-v1:0", 
"bedrock:meta.llama3-1-70b-instruct-v1:0", "bedrock:us.meta.llama3-1-70b-instruct-v1:0", "bedrock:meta.llama3-1-405b-instruct-v1:0", "bedrock:us.meta.llama3-2-11b-instruct-v1:0", "bedrock:us.meta.llama3-2-90b-instruct-v1:0", "bedrock:us.meta.llama3-2-1b-instruct-v1:0", "bedrock:us.meta.llama3-2-3b-instruct-v1:0", "bedrock:us.meta.llama3-3-70b-instruct-v1:0", "bedrock:mistral.mistral-7b-instruct-v0:2", "bedrock:mistral.mixtral-8x7b-instruct-v0:1", "bedrock:mistral.mistral-large-2402-v1:0", "bedrock:mistral.mistral-large-2407-v1:0", "cerebras:gpt-oss-120b", "cerebras:llama3.1-8b", "cerebras:llama-3.3-70b", "cerebras:llama-4-scout-17b-16e-instruct", "cerebras:llama-4-maverick-17b-128e-instruct", "cerebras:qwen-3-235b-a22b-instruct-2507", "cerebras:qwen-3-32b", "cerebras:qwen-3-coder-480b", "cerebras:qwen-3-235b-a22b-thinking-2507", "claude-3-5-haiku-20241022", "claude-3-5-haiku-latest", "claude-3-5-sonnet-20240620", "claude-3-5-sonnet-20241022", "claude-3-5-sonnet-latest", "claude-3-7-sonnet-20250219", "claude-3-7-sonnet-latest", "claude-3-haiku-20240307", "claude-3-opus-20240229", "claude-3-opus-latest", "claude-4-opus-20250514", "claude-4-sonnet-20250514", "claude-opus-4-0", "claude-opus-4-1-20250805", "claude-opus-4-20250514", "claude-sonnet-4-0", "claude-sonnet-4-20250514", "cohere:c4ai-aya-expanse-32b", "cohere:c4ai-aya-expanse-8b", "cohere:command", "cohere:command-light", "cohere:command-light-nightly", "cohere:command-nightly", "cohere:command-r", "cohere:command-r-03-2024", "cohere:command-r-08-2024", "cohere:command-r-plus", "cohere:command-r-plus-04-2024", "cohere:command-r-plus-08-2024", "cohere:command-r7b-12-2024", "deepseek:deepseek-chat", "deepseek:deepseek-reasoner", "google-gla:gemini-2.0-flash", "google-gla:gemini-2.0-flash-lite", "google-gla:gemini-2.5-flash", "google-gla:gemini-2.5-flash-lite", "google-gla:gemini-2.5-pro", "google-vertex:gemini-2.0-flash", "google-vertex:gemini-2.0-flash-lite", "google-vertex:gemini-2.5-flash", "google-vertex:gemini-2.5-flash-lite", "google-vertex:gemini-2.5-pro", "gpt-3.5-turbo", "gpt-3.5-turbo-0125", "gpt-3.5-turbo-0301", "gpt-3.5-turbo-0613", "gpt-3.5-turbo-1106", "gpt-3.5-turbo-16k", "gpt-3.5-turbo-16k-0613", "gpt-4", "gpt-4-0125-preview", "gpt-4-0314", "gpt-4-0613", "gpt-4-1106-preview", "gpt-4-32k", "gpt-4-32k-0314", "gpt-4-32k-0613", "gpt-4-turbo", "gpt-4-turbo-2024-04-09", "gpt-4-turbo-preview", "gpt-4-vision-preview", "gpt-4.1", "gpt-4.1-2025-04-14", "gpt-4.1-mini", "gpt-4.1-mini-2025-04-14", "gpt-4.1-nano", "gpt-4.1-nano-2025-04-14", "gpt-4o", "gpt-4o-2024-05-13", "gpt-4o-2024-08-06", "gpt-4o-2024-11-20", "gpt-4o-audio-preview", "gpt-4o-audio-preview-2024-10-01", "gpt-4o-audio-preview-2024-12-17", "gpt-4o-audio-preview-2025-06-03", "gpt-4o-mini", "gpt-4o-mini-2024-07-18", "gpt-4o-mini-audio-preview", "gpt-4o-mini-audio-preview-2024-12-17", "gpt-4o-mini-search-preview", "gpt-4o-mini-search-preview-2025-03-11", "gpt-4o-search-preview", "gpt-4o-search-preview-2025-03-11", "gpt-5", "gpt-5-2025-08-07", "gpt-5-chat-latest", "gpt-5-mini", "gpt-5-mini-2025-08-07", "gpt-5-nano", "gpt-5-nano-2025-08-07", "grok:grok-4", "grok:grok-4-0709", "grok:grok-3", "grok:grok-3-mini", "grok:grok-3-fast", "grok:grok-3-mini-fast", "grok:grok-2-vision-1212", "grok:grok-2-image-1212", "groq:distil-whisper-large-v3-en", "groq:gemma2-9b-it", "groq:llama-3.3-70b-versatile", "groq:llama-3.1-8b-instant", "groq:llama-guard-3-8b", "groq:llama3-70b-8192", "groq:llama3-8b-8192", "groq:moonshotai/kimi-k2-instruct", "groq:whisper-large-v3", 
"groq:whisper-large-v3-turbo", "groq:playai-tts", "groq:playai-tts-arabic", "groq:qwen-qwq-32b", "groq:mistral-saba-24b", "groq:qwen-2.5-coder-32b", "groq:qwen-2.5-32b", "groq:deepseek-r1-distill-qwen-32b", "groq:deepseek-r1-distill-llama-70b", "groq:llama-3.3-70b-specdec", "groq:llama-3.2-1b-preview", "groq:llama-3.2-3b-preview", "groq:llama-3.2-11b-vision-preview", "groq:llama-3.2-90b-vision-preview", "heroku:claude-3-5-haiku", "heroku:claude-3-5-sonnet-latest", "heroku:claude-3-7-sonnet", "heroku:claude-4-sonnet", "heroku:claude-3-haiku", "heroku:gpt-oss-120b", "heroku:nova-lite", "heroku:nova-pro", "huggingface:Qwen/QwQ-32B", "huggingface:Qwen/Qwen2.5-72B-Instruct", "huggingface:Qwen/Qwen3-235B-A22B", "huggingface:Qwen/Qwen3-32B", "huggingface:deepseek-ai/DeepSeek-R1", "huggingface:meta-llama/Llama-3.3-70B-Instruct", "huggingface:meta-llama/Llama-4-Maverick-17B-128E-Instruct", "huggingface:meta-llama/Llama-4-Scout-17B-16E-Instruct", "mistral:codestral-latest", "mistral:mistral-large-latest", "mistral:mistral-moderation-latest", "mistral:mistral-small-latest", "moonshotai:moonshot-v1-8k", "moonshotai:moonshot-v1-32k", "moonshotai:moonshot-v1-128k", "moonshotai:moonshot-v1-8k-vision-preview", "moonshotai:moonshot-v1-32k-vision-preview", "moonshotai:moonshot-v1-128k-vision-preview", "moonshotai:kimi-latest", "moonshotai:kimi-thinking-preview", "moonshotai:kimi-k2-0711-preview", "o1", "o1-2024-12-17", "o1-mini", "o1-mini-2024-09-12", "o1-preview", "o1-preview-2024-09-12", "o1-pro", "o1-pro-2025-03-19", "o3", "o3-2025-04-16", "o3-deep-research", "o3-deep-research-2025-06-26", "o3-mini", "o3-mini-2025-01-31", "o3-pro", "o3-pro-2025-06-10", "openai:chatgpt-4o-latest", "openai:codex-mini-latest", "openai:gpt-3.5-turbo", "openai:gpt-3.5-turbo-0125", "openai:gpt-3.5-turbo-0301", "openai:gpt-3.5-turbo-0613", "openai:gpt-3.5-turbo-1106", "openai:gpt-3.5-turbo-16k", "openai:gpt-3.5-turbo-16k-0613", "openai:gpt-4", "openai:gpt-4-0125-preview", "openai:gpt-4-0314", "openai:gpt-4-0613", "openai:gpt-4-1106-preview", "openai:gpt-4-32k", "openai:gpt-4-32k-0314", "openai:gpt-4-32k-0613", "openai:gpt-4-turbo", "openai:gpt-4-turbo-2024-04-09", "openai:gpt-4-turbo-preview", "openai:gpt-4-vision-preview", "openai:gpt-4.1", "openai:gpt-4.1-2025-04-14", "openai:gpt-4.1-mini", "openai:gpt-4.1-mini-2025-04-14", "openai:gpt-4.1-nano", "openai:gpt-4.1-nano-2025-04-14", "openai:gpt-4o", "openai:gpt-4o-2024-05-13", "openai:gpt-4o-2024-08-06", "openai:gpt-4o-2024-11-20", "openai:gpt-4o-audio-preview", "openai:gpt-4o-audio-preview-2024-10-01", "openai:gpt-4o-audio-preview-2024-12-17", "openai:gpt-4o-audio-preview-2025-06-03", "openai:gpt-4o-mini", "openai:gpt-4o-mini-2024-07-18", "openai:gpt-4o-mini-audio-preview", "openai:gpt-4o-mini-audio-preview-2024-12-17", "openai:gpt-4o-mini-search-preview", "openai:gpt-4o-mini-search-preview-2025-03-11", "openai:gpt-4o-search-preview", "openai:gpt-4o-search-preview-2025-03-11", "openai:gpt-5", "openai:gpt-5-2025-08-07", "openai:o1", "openai:gpt-5-chat-latest", "openai:o1-2024-12-17", "openai:gpt-5-mini", "openai:o1-mini", "openai:gpt-5-mini-2025-08-07", "openai:o1-mini-2024-09-12", "openai:gpt-5-nano", "openai:o1-preview", "openai:gpt-5-nano-2025-08-07", "openai:o1-preview-2024-09-12", "openai:o1-pro", "openai:o1-pro-2025-03-19", "openai:o3", "openai:o3-2025-04-16", "openai:o3-deep-research", "openai:o3-deep-research-2025-06-26", "openai:o3-mini", "openai:o3-mini-2025-01-31", "openai:o4-mini", "openai:o4-mini-2025-04-16", "openai:o4-mini-deep-research", 
"openai:o4-mini-deep-research-2025-06-26", "openai:o3-pro", "openai:o3-pro-2025-06-10", "openai:computer-use-preview", "openai:computer-use-preview-2025-03-11", "test", ], ) ``` Known model names that can be used with the `model` parameter of Agent. `KnownModelName` is provided as a concise way to specify a model. ### ModelRequestParameters Configuration for an agent's request to a model, specifically related to tools and output handling. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @dataclass(repr=False, kw_only=True) class ModelRequestParameters: """Configuration for an agent's request to a model, specifically related to tools and output handling.""" function_tools: list[ToolDefinition] = field(default_factory=list) builtin_tools: list[AbstractBuiltinTool] = field(default_factory=list) output_mode: OutputMode = 'text' output_object: OutputObjectDefinition | None = None output_tools: list[ToolDefinition] = field(default_factory=list) allow_text_output: bool = True @cached_property def tool_defs(self) -> dict[str, ToolDefinition]: return {tool_def.name: tool_def for tool_def in [*self.function_tools, *self.output_tools]} __repr__ = _utils.dataclasses_no_defaults_repr ``` ### Model Bases: `ABC` Abstract class for a model. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python class Model(ABC): """Abstract class for a model.""" _profile: ModelProfileSpec | None = None _settings: ModelSettings | None = None def __init__( self, *, settings: ModelSettings | None = None, profile: ModelProfileSpec | None = None, ) -> None: """Initialize the model with optional settings and profile. Args: settings: Model-specific settings that will be used as defaults for this model. profile: The model profile to use. """ self._settings = settings self._profile = profile @property def settings(self) -> ModelSettings | None: """Get the model settings.""" return self._settings @abstractmethod async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a request to the model.""" raise NotImplementedError() async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage: """Make a request to the model for counting tokens.""" # This method is not required, but you need to implement it if you want to support `UsageLimits.count_tokens_before_request`. raise NotImplementedError(f'Token counting ahead of the request is not supported by {self.__class__.__name__}') @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a request to the model and return a streaming response.""" # This method is not required, but you need to implement it if you want to support streamed responses raise NotImplementedError(f'Streamed requests not supported by this {self.__class__.__name__}') # yield is required to make this a generator for type checking # noinspection PyUnreachableCode yield # pragma: no cover def customize_request_parameters(self, model_request_parameters: ModelRequestParameters) -> ModelRequestParameters: """Customize the request parameters for the model. This method can be overridden by subclasses to modify the request parameters before sending them to the model. 
In particular, this method can be used to make modifications to the generated tool JSON schemas if necessary for vendor/model-specific reasons. """ if transformer := self.profile.json_schema_transformer: model_request_parameters = replace( model_request_parameters, function_tools=[_customize_tool_def(transformer, t) for t in model_request_parameters.function_tools], output_tools=[_customize_tool_def(transformer, t) for t in model_request_parameters.output_tools], ) if output_object := model_request_parameters.output_object: model_request_parameters = replace( model_request_parameters, output_object=_customize_output_object(transformer, output_object), ) return model_request_parameters @property @abstractmethod def model_name(self) -> str: """The model name.""" raise NotImplementedError() @cached_property def profile(self) -> ModelProfile: """The model profile.""" _profile = self._profile if callable(_profile): _profile = _profile(self.model_name) if _profile is None: return DEFAULT_PROFILE return _profile @property @abstractmethod def system(self) -> str: """The model provider, ex: openai. Use to populate the `gen_ai.system` OpenTelemetry semantic convention attribute, so should use well-known values listed in https://opentelemetry.io/docs/specs/semconv/attributes-registry/gen-ai/#gen-ai-system when applicable. """ raise NotImplementedError() @property def base_url(self) -> str | None: """The base URL for the provider API, if available.""" return None @staticmethod def _get_instructions(messages: list[ModelMessage]) -> str | None: """Get instructions from the first ModelRequest found when iterating messages in reverse. In the case that a "mock" request was generated to include a tool-return part for a result tool, we want to use the instructions from the second-to-most-recent request (which should correspond to the original request that generated the response that resulted in the tool-return part). """ last_two_requests: list[ModelRequest] = [] for message in reversed(messages): if isinstance(message, ModelRequest): last_two_requests.append(message) if len(last_two_requests) == 2: break if message.instructions is not None: return message.instructions # If we don't have two requests, and we didn't already return instructions, there are definitely not any: if len(last_two_requests) != 2: return None most_recent_request = last_two_requests[0] second_most_recent_request = last_two_requests[1] # If we've gotten this far and the most recent request consists of only tool-return parts or retry-prompt parts, # we use the instructions from the second-to-most-recent request. This is necessary because when handling # result tools, we generate a "mock" ModelRequest with a tool-return part for it, and that ModelRequest will not # have the relevant instructions from the agent. # While it's possible that you could have a message history where the most recent request has only tool returns, # I believe there is no way to achieve that would _change_ the instructions without manually crafting the most # recent message. That might make sense in principle for some usage pattern, but it's enough of an edge case # that I think it's not worth worrying about, since you can work around this by inserting another ModelRequest # with no parts at all immediately before the request that has the tool calls (that works because we only look # at the two most recent ModelRequests here). # If you have a use case where this causes pain, please open a GitHub issue and we can discuss alternatives. 
if all(p.part_kind == 'tool-return' or p.part_kind == 'retry-prompt' for p in most_recent_request.parts): return second_most_recent_request.instructions return None ``` #### __init__ ```python __init__( *, settings: ModelSettings | None = None, profile: ModelProfileSpec | None = None ) -> None ``` Initialize the model with optional settings and profile. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. | `None` | | `profile` | `ModelProfileSpec | None` | The model profile to use. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def __init__( self, *, settings: ModelSettings | None = None, profile: ModelProfileSpec | None = None, ) -> None: """Initialize the model with optional settings and profile. Args: settings: Model-specific settings that will be used as defaults for this model. profile: The model profile to use. """ self._settings = settings self._profile = profile ``` #### settings ```python settings: ModelSettings | None ``` Get the model settings. #### request ```python request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a request to the model. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @abstractmethod async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a request to the model.""" raise NotImplementedError() ``` #### count_tokens ```python count_tokens( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage ``` Make a request to the model for counting tokens. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage: """Make a request to the model for counting tokens.""" # This method is not required, but you need to implement it if you want to support `UsageLimits.count_tokens_before_request`. raise NotImplementedError(f'Token counting ahead of the request is not supported by {self.__class__.__name__}') ``` #### request_stream ```python request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a request to the model and return a streaming response. 
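For orientation, here is a minimal sketch of driving `request_stream` directly (most applications let an agent do this for them); the model name and prompt are illustrative, and any configured model would do:

```python
from pydantic_ai.messages import ModelRequest, UserPromptPart
from pydantic_ai.models import ModelRequestParameters, infer_model


async def main():
    model = infer_model('openai:gpt-4o')  # illustrative model choice
    messages = [ModelRequest(parts=[UserPromptPart(content='Tell me a joke.')])]
    # let the model adjust tool/output schemas before the request
    params = model.customize_request_parameters(ModelRequestParameters())
    async with model.request_stream(messages, None, params) as stream:
        async for event in stream:
            print(event)  # ModelResponseStreamEvents as they arrive
        print(stream.get())  # the ModelResponse assembled from the stream so far
        print(stream.usage())  # RequestUsage for this request
```

*(As elsewhere in these docs, you'll need to add `asyncio.run(main())` to run `main`.)*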
Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a request to the model and return a streaming response.""" # This method is not required, but you need to implement it if you want to support streamed responses raise NotImplementedError(f'Streamed requests not supported by this {self.__class__.__name__}') # yield is required to make this a generator for type checking # noinspection PyUnreachableCode yield # pragma: no cover ``` #### customize_request_parameters ```python customize_request_parameters( model_request_parameters: ModelRequestParameters, ) -> ModelRequestParameters ``` Customize the request parameters for the model. This method can be overridden by subclasses to modify the request parameters before sending them to the model. In particular, this method can be used to make modifications to the generated tool JSON schemas if necessary for vendor/model-specific reasons. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def customize_request_parameters(self, model_request_parameters: ModelRequestParameters) -> ModelRequestParameters: """Customize the request parameters for the model. This method can be overridden by subclasses to modify the request parameters before sending them to the model. In particular, this method can be used to make modifications to the generated tool JSON schemas if necessary for vendor/model-specific reasons. """ if transformer := self.profile.json_schema_transformer: model_request_parameters = replace( model_request_parameters, function_tools=[_customize_tool_def(transformer, t) for t in model_request_parameters.function_tools], output_tools=[_customize_tool_def(transformer, t) for t in model_request_parameters.output_tools], ) if output_object := model_request_parameters.output_object: model_request_parameters = replace( model_request_parameters, output_object=_customize_output_object(transformer, output_object), ) return model_request_parameters ``` #### model_name ```python model_name: str ``` The model name. #### profile ```python profile: ModelProfile ``` The model profile. #### system ```python system: str ``` The model provider, ex: openai. Use to populate the `gen_ai.system` OpenTelemetry semantic convention attribute, so should use well-known values listed in https://opentelemetry.io/docs/specs/semconv/attributes-registry/gen-ai/#gen-ai-system when applicable. #### base_url ```python base_url: str | None ``` The base URL for the provider API, if available. ### StreamedResponse Bases: `ABC` Streamed response from an LLM when calling a tool. 
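To sketch what a concrete implementation involves (see `AnthropicStreamedResponse` above for a real one), the hypothetical subclass below simply replays pre-computed text chunks through the parts manager; a real subclass would translate a vendor event stream and update `_usage` as it goes:

```python
from collections.abc import AsyncIterator
from dataclasses import dataclass, field
from datetime import datetime, timezone

from pydantic_ai.messages import ModelResponseStreamEvent
from pydantic_ai.models import StreamedResponse


@dataclass
class StaticTextStreamedResponse(StreamedResponse):
    """Hypothetical streamed response that replays pre-computed text chunks."""

    _model_name: str
    _chunks: list[str]
    _timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]:
        for chunk in self._chunks:
            # the parts manager turns raw deltas into part start/delta events
            event = self._parts_manager.handle_text_delta(vendor_part_id='text', content=chunk)
            if event is not None:
                yield event

    @property
    def model_name(self) -> str:
        return self._model_name

    @property
    def provider_name(self) -> str | None:
        return None

    @property
    def timestamp(self) -> datetime:
        return self._timestamp
```

Iterating the instance (via `__aiter__` below) yields the events, and `get()` then returns the `ModelResponse` built so far.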
Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @dataclass class StreamedResponse(ABC): """Streamed response from an LLM when calling a tool.""" model_request_parameters: ModelRequestParameters final_result_event: FinalResultEvent | None = field(default=None, init=False) provider_response_id: str | None = field(default=None, init=False) provider_details: dict[str, Any] | None = field(default=None, init=False) finish_reason: FinishReason | None = field(default=None, init=False) _parts_manager: ModelResponsePartsManager = field(default_factory=ModelResponsePartsManager, init=False) _event_iterator: AsyncIterator[ModelResponseStreamEvent] | None = field(default=None, init=False) _usage: RequestUsage = field(default_factory=RequestUsage, init=False) def __aiter__(self) -> AsyncIterator[ModelResponseStreamEvent]: """Stream the response as an async iterable of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s. This proxies the `_event_iterator()` and emits all events, while also checking for matches on the result schema and emitting a [`FinalResultEvent`][pydantic_ai.messages.FinalResultEvent] if/when the first match is found. """ if self._event_iterator is None: async def iterator_with_final_event( iterator: AsyncIterator[ModelResponseStreamEvent], ) -> AsyncIterator[ModelResponseStreamEvent]: async for event in iterator: yield event if ( final_result_event := _get_final_result_event(event, self.model_request_parameters) ) is not None: self.final_result_event = final_result_event yield final_result_event break # If we broke out of the above loop, we need to yield the rest of the events # If we didn't, this will just be a no-op async for event in iterator: yield event self._event_iterator = iterator_with_final_event(self._get_event_iterator()) return self._event_iterator @abstractmethod async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: """Return an async iterator of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s. This method should be implemented by subclasses to translate the vendor-specific stream of events into pydantic_ai-format events. It should use the `_parts_manager` to handle deltas, and should update the `_usage` attributes as it goes. """ raise NotImplementedError() # noinspection PyUnreachableCode yield def get(self) -> ModelResponse: """Build a [`ModelResponse`][pydantic_ai.messages.ModelResponse] from the data received from the stream so far.""" return ModelResponse( parts=self._parts_manager.get_parts(), model_name=self.model_name, timestamp=self.timestamp, usage=self.usage(), provider_name=self.provider_name, provider_response_id=self.provider_response_id, provider_details=self.provider_details, finish_reason=self.finish_reason, ) def usage(self) -> RequestUsage: """Get the usage of the response so far. This will not be the final usage until the stream is exhausted.""" return self._usage @property @abstractmethod def model_name(self) -> str: """Get the model name of the response.""" raise NotImplementedError() @property @abstractmethod def provider_name(self) -> str | None: """Get the provider name.""" raise NotImplementedError() @property @abstractmethod def timestamp(self) -> datetime: """Get the timestamp of the response.""" raise NotImplementedError() ``` #### __aiter__ ```python __aiter__() -> AsyncIterator[ModelResponseStreamEvent] ``` Stream the response as an async iterable of ModelResponseStreamEvents. 
This proxies the `_event_iterator()` and emits all events, while also checking for matches on the result schema and emitting a FinalResultEvent if/when the first match is found. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def __aiter__(self) -> AsyncIterator[ModelResponseStreamEvent]: """Stream the response as an async iterable of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s. This proxies the `_event_iterator()` and emits all events, while also checking for matches on the result schema and emitting a [`FinalResultEvent`][pydantic_ai.messages.FinalResultEvent] if/when the first match is found. """ if self._event_iterator is None: async def iterator_with_final_event( iterator: AsyncIterator[ModelResponseStreamEvent], ) -> AsyncIterator[ModelResponseStreamEvent]: async for event in iterator: yield event if ( final_result_event := _get_final_result_event(event, self.model_request_parameters) ) is not None: self.final_result_event = final_result_event yield final_result_event break # If we broke out of the above loop, we need to yield the rest of the events # If we didn't, this will just be a no-op async for event in iterator: yield event self._event_iterator = iterator_with_final_event(self._get_event_iterator()) return self._event_iterator ``` #### get ```python get() -> ModelResponse ``` Build a ModelResponse from the data received from the stream so far. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def get(self) -> ModelResponse: """Build a [`ModelResponse`][pydantic_ai.messages.ModelResponse] from the data received from the stream so far.""" return ModelResponse( parts=self._parts_manager.get_parts(), model_name=self.model_name, timestamp=self.timestamp, usage=self.usage(), provider_name=self.provider_name, provider_response_id=self.provider_response_id, provider_details=self.provider_details, finish_reason=self.finish_reason, ) ``` #### usage ```python usage() -> RequestUsage ``` Get the usage of the response so far. This will not be the final usage until the stream is exhausted. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def usage(self) -> RequestUsage: """Get the usage of the response so far. This will not be the final usage until the stream is exhausted.""" return self._usage ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: str | None ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. ### ALLOW_MODEL_REQUESTS ```python ALLOW_MODEL_REQUESTS = True ``` Whether to allow requests to models. This global setting allows you to disable requests to most models, e.g. to make sure you don't accidentally make costly requests to a model during tests. The testing models TestModel and FunctionModel are not affected by this setting. ### check_allow_model_requests ```python check_allow_model_requests() -> None ``` Check if model requests are allowed. If you're defining your own models that have costs or latency associated with their use, you should call this in Model.request and Model.request_stream. Raises: | Type | Description | | --- | --- | | `RuntimeError` | If model requests are not allowed. | Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def check_allow_model_requests() -> None: """Check if model requests are allowed.
If you're defining your own models that have costs or latency associated with their use, you should call this in [`Model.request`][pydantic_ai.models.Model.request] and [`Model.request_stream`][pydantic_ai.models.Model.request_stream]. Raises: RuntimeError: If model requests are not allowed. """ if not ALLOW_MODEL_REQUESTS: raise RuntimeError('Model requests are not allowed, since ALLOW_MODEL_REQUESTS is False') ``` ### override_allow_model_requests ```python override_allow_model_requests( allow_model_requests: bool, ) -> Iterator[None] ``` Context manager to temporarily override ALLOW_MODEL_REQUESTS. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `allow_model_requests` | `bool` | Whether to allow model requests within the context. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @contextmanager def override_allow_model_requests(allow_model_requests: bool) -> Iterator[None]: """Context manager to temporarily override [`ALLOW_MODEL_REQUESTS`][pydantic_ai.models.ALLOW_MODEL_REQUESTS]. Args: allow_model_requests: Whether to allow model requests within the context. """ global ALLOW_MODEL_REQUESTS old_value = ALLOW_MODEL_REQUESTS ALLOW_MODEL_REQUESTS = allow_model_requests # pyright: ignore[reportConstantRedefinition] try: yield finally: ALLOW_MODEL_REQUESTS = old_value # pyright: ignore[reportConstantRedefinition] ``` # `pydantic_ai.models.bedrock` ## Setup For details on how to set up authentication with this model, see [model configuration for Bedrock](../../../models/bedrock/). ### LatestBedrockModelNames ```python LatestBedrockModelNames = Literal[ "amazon.titan-tg1-large", "amazon.titan-text-lite-v1", "amazon.titan-text-express-v1", "us.amazon.nova-pro-v1:0", "us.amazon.nova-lite-v1:0", "us.amazon.nova-micro-v1:0", "anthropic.claude-3-5-sonnet-20241022-v2:0", "us.anthropic.claude-3-5-sonnet-20241022-v2:0", "anthropic.claude-3-5-haiku-20241022-v1:0", "us.anthropic.claude-3-5-haiku-20241022-v1:0", "anthropic.claude-instant-v1", "anthropic.claude-v2:1", "anthropic.claude-v2", "anthropic.claude-3-sonnet-20240229-v1:0", "us.anthropic.claude-3-sonnet-20240229-v1:0", "anthropic.claude-3-haiku-20240307-v1:0", "us.anthropic.claude-3-haiku-20240307-v1:0", "anthropic.claude-3-opus-20240229-v1:0", "us.anthropic.claude-3-opus-20240229-v1:0", "anthropic.claude-3-5-sonnet-20240620-v1:0", "us.anthropic.claude-3-5-sonnet-20240620-v1:0", "anthropic.claude-3-7-sonnet-20250219-v1:0", "us.anthropic.claude-3-7-sonnet-20250219-v1:0", "anthropic.claude-opus-4-20250514-v1:0", "us.anthropic.claude-opus-4-20250514-v1:0", "anthropic.claude-sonnet-4-20250514-v1:0", "us.anthropic.claude-sonnet-4-20250514-v1:0", "cohere.command-text-v14", "cohere.command-r-v1:0", "cohere.command-r-plus-v1:0", "cohere.command-light-text-v14", "meta.llama3-8b-instruct-v1:0", "meta.llama3-70b-instruct-v1:0", "meta.llama3-1-8b-instruct-v1:0", "us.meta.llama3-1-8b-instruct-v1:0", "meta.llama3-1-70b-instruct-v1:0", "us.meta.llama3-1-70b-instruct-v1:0", "meta.llama3-1-405b-instruct-v1:0", "us.meta.llama3-2-11b-instruct-v1:0", "us.meta.llama3-2-90b-instruct-v1:0", "us.meta.llama3-2-1b-instruct-v1:0", "us.meta.llama3-2-3b-instruct-v1:0", "us.meta.llama3-3-70b-instruct-v1:0", "mistral.mistral-7b-instruct-v0:2", "mistral.mixtral-8x7b-instruct-v0:1", "mistral.mistral-large-2402-v1:0", "mistral.mistral-large-2407-v1:0", ] ``` Latest Bedrock models. ### BedrockModelName ```python BedrockModelName = str | LatestBedrockModelNames ``` Possible Bedrock model names. 
Since Bedrock supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Bedrock docs](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for a full list. ### BedrockModelSettings Bases: `ModelSettings` Settings for Bedrock models. See [the Bedrock Converse API docs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html#API_runtime_Converse_RequestSyntax) for a full list. See [the boto3 implementation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html) of the Bedrock Converse API. Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python class BedrockModelSettings(ModelSettings, total=False): """Settings for Bedrock models. See [the Bedrock Converse API docs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html#API_runtime_Converse_RequestSyntax) for a full list. See [the boto3 implementation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html) of the Bedrock Converse API. """ # ALL FIELDS MUST BE `bedrock_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. bedrock_guardrail_config: GuardrailConfigurationTypeDef """Content moderation and safety settings for Bedrock API requests.""" bedrock_performance_configuration: PerformanceConfigurationTypeDef """Performance optimization settings for model inference.""" bedrock_request_metadata: dict[str, str] """Additional metadata to attach to Bedrock API requests.""" bedrock_additional_model_response_fields_paths: list[str] """JSON paths to extract additional fields from model responses.""" bedrock_prompt_variables: Mapping[str, PromptVariableValuesTypeDef] """Variables for substitution into prompt templates.""" bedrock_additional_model_requests_fields: Mapping[str, Any] """Additional model-specific parameters to include in requests.""" ``` #### bedrock_guardrail_config ```python bedrock_guardrail_config: GuardrailConfigurationTypeDef ``` Content moderation and safety settings for Bedrock API requests. #### bedrock_performance_configuration ```python bedrock_performance_configuration: ( PerformanceConfigurationTypeDef ) ``` Performance optimization settings for model inference. #### bedrock_request_metadata ```python bedrock_request_metadata: dict[str, str] ``` Additional metadata to attach to Bedrock API requests. #### bedrock_additional_model_response_fields_paths ```python bedrock_additional_model_response_fields_paths: list[str] ``` JSON paths to extract additional fields from model responses. #### bedrock_prompt_variables ```python bedrock_prompt_variables: Mapping[ str, PromptVariableValuesTypeDef ] ``` Variables for substitution into prompt templates. #### bedrock_additional_model_requests_fields ```python bedrock_additional_model_requests_fields: Mapping[str, Any] ``` Additional model-specific parameters to include in requests. ### BedrockConverseModel Bases: `Model` A model that uses the Bedrock Converse API.
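A minimal usage sketch combining `BedrockConverseModel` with the `BedrockModelSettings` above (hedged: the model ID and guardrail/performance values are placeholders, and AWS credentials with Bedrock access are assumed):

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings

# Placeholder values: the guardrail identifier/version must exist in your AWS account.
settings = BedrockModelSettings(
    bedrock_guardrail_config={'guardrailIdentifier': 'v1', 'guardrailVersion': 'v1', 'trace': 'enabled'},
    bedrock_performance_configuration={'latency': 'optimized'},
)

model = BedrockConverseModel('us.amazon.nova-micro-v1:0')
agent = Agent(model, model_settings=settings)

result = agent.run_sync('What is the capital of France?')
print(result.output)
```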
Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python @dataclass(init=False) class BedrockConverseModel(Model): """A model that uses the Bedrock Converse API.""" client: BedrockRuntimeClient _model_name: BedrockModelName = field(repr=False) _provider: Provider[BaseClient] = field(repr=False) def __init__( self, model_name: BedrockModelName, *, provider: Literal['bedrock'] | Provider[BaseClient] = 'bedrock', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Bedrock model. Args: model_name: The name of the model to use. model_name: The name of the Bedrock model to use. List of model names available [here](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). provider: The provider to use for authentication and API access. Can be either the string 'bedrock' or an instance of `Provider[BaseClient]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = cast('BedrockRuntimeClient', provider.client) super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return str(self.client.meta.endpoint_url) @property def model_name(self) -> str: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[ToolTypeDef]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] @staticmethod def _map_tool_definition(f: ToolDefinition) -> ToolTypeDef: tool_spec: ToolSpecificationTypeDef = {'name': f.name, 'inputSchema': {'json': f.parameters_json_schema}} if f.description: # pragma: no branch tool_spec['description'] = f.description return {'toolSpec': tool_spec} async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: settings = cast(BedrockModelSettings, model_settings or {}) response = await self._messages_create(messages, False, settings, model_request_parameters) model_response = await self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: settings = cast(BedrockModelSettings, model_settings or {}) response = await self._messages_create(messages, True, settings, model_request_parameters) yield BedrockStreamedResponse( model_request_parameters=model_request_parameters, _model_name=self.model_name, _event_stream=response['stream'], _provider_name=self._provider.name, _provider_response_id=response.get('ResponseMetadata', {}).get('RequestId', None), ) async def _process_response(self, response: ConverseResponseTypeDef) -> ModelResponse: items: list[ModelResponsePart] = [] if message := response['output'].get('message'): # pragma: no branch for item in message['content']: if reasoning_content := item.get('reasoningContent'): if redacted_content := 
reasoning_content.get('redactedContent'): items.append( ThinkingPart( id='redacted_content', content='', signature=redacted_content.decode('utf-8'), provider_name=self.system, ) ) elif reasoning_text := reasoning_content.get('reasoningText'): # pragma: no branch signature = reasoning_text.get('signature') items.append( ThinkingPart( content=reasoning_text['text'], signature=signature, provider_name=self.system if signature else None, ) ) if text := item.get('text'): items.append(TextPart(content=text)) elif tool_use := item.get('toolUse'): items.append( ToolCallPart( tool_name=tool_use['name'], args=tool_use['input'], tool_call_id=tool_use['toolUseId'], ), ) u = usage.RequestUsage( input_tokens=response['usage']['inputTokens'], output_tokens=response['usage']['outputTokens'], ) response_id = response.get('ResponseMetadata', {}).get('RequestId', None) raw_finish_reason = response['stopReason'] provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=items, usage=u, model_name=self.model_name, provider_response_id=response_id, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) @overload async def _messages_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: BedrockModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ConverseStreamResponseTypeDef: pass @overload async def _messages_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: BedrockModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ConverseResponseTypeDef: pass async def _messages_create( self, messages: list[ModelMessage], stream: bool, model_settings: BedrockModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ConverseResponseTypeDef | ConverseStreamResponseTypeDef: system_prompt, bedrock_messages = await self._map_messages(messages) inference_config = self._map_inference_config(model_settings) params: ConverseRequestTypeDef = { 'modelId': self.model_name, 'messages': bedrock_messages, 'system': system_prompt, 'inferenceConfig': inference_config, } tool_config = self._map_tool_config(model_request_parameters) if tool_config: params['toolConfig'] = tool_config if model_request_parameters.builtin_tools: raise UserError('Bedrock does not support built-in tools') # Bedrock supports a set of specific extra parameters if model_settings: if guardrail_config := model_settings.get('bedrock_guardrail_config', None): params['guardrailConfig'] = guardrail_config if performance_configuration := model_settings.get('bedrock_performance_configuration', None): params['performanceConfig'] = performance_configuration if request_metadata := model_settings.get('bedrock_request_metadata', None): params['requestMetadata'] = request_metadata if additional_model_response_fields_paths := model_settings.get( 'bedrock_additional_model_response_fields_paths', None ): params['additionalModelResponseFieldPaths'] = additional_model_response_fields_paths if additional_model_requests_fields := model_settings.get('bedrock_additional_model_requests_fields', None): params['additionalModelRequestFields'] = additional_model_requests_fields if prompt_variables := model_settings.get('bedrock_prompt_variables', None): params['promptVariables'] = prompt_variables if stream: model_response = await anyio.to_thread.run_sync(functools.partial(self.client.converse_stream, **params)) else: model_response = await 
anyio.to_thread.run_sync(functools.partial(self.client.converse, **params)) return model_response @staticmethod def _map_inference_config( model_settings: ModelSettings | None, ) -> InferenceConfigurationTypeDef: model_settings = model_settings or {} inference_config: InferenceConfigurationTypeDef = {} if max_tokens := model_settings.get('max_tokens'): inference_config['maxTokens'] = max_tokens if (temperature := model_settings.get('temperature')) is not None: inference_config['temperature'] = temperature if top_p := model_settings.get('top_p'): inference_config['topP'] = top_p if stop_sequences := model_settings.get('stop_sequences'): inference_config['stopSequences'] = stop_sequences return inference_config def _map_tool_config(self, model_request_parameters: ModelRequestParameters) -> ToolConfigurationTypeDef | None: tools = self._get_tools(model_request_parameters) if not tools: return None tool_choice: ToolChoiceTypeDef if not model_request_parameters.allow_text_output: tool_choice = {'any': {}} else: tool_choice = {'auto': {}} tool_config: ToolConfigurationTypeDef = {'tools': tools} if tool_choice and BedrockModelProfile.from_profile(self.profile).bedrock_supports_tool_choice: tool_config['toolChoice'] = tool_choice return tool_config async def _map_messages( # noqa: C901 self, messages: list[ModelMessage] ) -> tuple[list[SystemContentBlockTypeDef], list[MessageUnionTypeDef]]: """Maps a `pydantic_ai.Message` to the Bedrock `MessageUnionTypeDef`. Groups consecutive ToolReturnPart objects into a single user message as required by Bedrock Claude/Nova models. """ profile = BedrockModelProfile.from_profile(self.profile) system_prompt: list[SystemContentBlockTypeDef] = [] bedrock_messages: list[MessageUnionTypeDef] = [] document_count: Iterator[int] = count(1) for message in messages: if isinstance(message, ModelRequest): for part in message.parts: if isinstance(part, SystemPromptPart) and part.content: system_prompt.append({'text': part.content}) elif isinstance(part, UserPromptPart): bedrock_messages.extend(await self._map_user_prompt(part, document_count)) elif isinstance(part, ToolReturnPart): assert part.tool_call_id is not None bedrock_messages.append( { 'role': 'user', 'content': [ { 'toolResult': { 'toolUseId': part.tool_call_id, 'content': [ {'text': part.model_response_str()} if profile.bedrock_tool_result_format == 'text' else {'json': part.model_response_object()} ], 'status': 'success', } } ], } ) elif isinstance(part, RetryPromptPart): # TODO(Marcelo): We need to add a test here. 
if part.tool_name is None: # pragma: no cover bedrock_messages.append({'role': 'user', 'content': [{'text': part.model_response()}]}) else: assert part.tool_call_id is not None bedrock_messages.append( { 'role': 'user', 'content': [ { 'toolResult': { 'toolUseId': part.tool_call_id, 'content': [{'text': part.model_response()}], 'status': 'error', } } ], } ) elif isinstance(message, ModelResponse): content: list[ContentBlockOutputTypeDef] = [] for item in message.parts: if isinstance(item, TextPart): content.append({'text': item.content}) elif isinstance(item, ThinkingPart): if ( item.provider_name == self.system and item.signature and BedrockModelProfile.from_profile(self.profile).bedrock_send_back_thinking_parts ): if item.id == 'redacted_content': reasoning_content: ReasoningContentBlockOutputTypeDef = { 'redactedContent': item.signature.encode('utf-8'), } else: reasoning_content: ReasoningContentBlockOutputTypeDef = { 'reasoningText': { 'text': item.content, 'signature': item.signature, } } content.append({'reasoningContent': reasoning_content}) else: start_tag, end_tag = self.profile.thinking_tags content.append({'text': '\n'.join([start_tag, item.content, end_tag])}) elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): pass else: assert isinstance(item, ToolCallPart) content.append(self._map_tool_call(item)) bedrock_messages.append({'role': 'assistant', 'content': content}) else: assert_never(message) # Merge together sequential user messages. processed_messages: list[MessageUnionTypeDef] = [] last_message: dict[str, Any] | None = None for current_message in bedrock_messages: if ( last_message is not None and current_message['role'] == last_message['role'] and current_message['role'] == 'user' ): # Add the new user content onto the existing user message. last_content = list(last_message['content']) last_content.extend(current_message['content']) last_message['content'] = last_content continue # Add the entire message to the list of messages. 
processed_messages.append(current_message) last_message = cast(dict[str, Any], current_message) if instructions := self._get_instructions(messages): system_prompt.insert(0, {'text': instructions}) return system_prompt, processed_messages @staticmethod async def _map_user_prompt(part: UserPromptPart, document_count: Iterator[int]) -> list[MessageUnionTypeDef]: content: list[ContentBlockUnionTypeDef] = [] if isinstance(part.content, str): content.append({'text': part.content}) else: for item in part.content: if isinstance(item, str): content.append({'text': item}) elif isinstance(item, BinaryContent): format = item.format if item.is_document: name = f'Document {next(document_count)}' assert format in ('pdf', 'txt', 'csv', 'doc', 'docx', 'xls', 'xlsx', 'html', 'md') content.append({'document': {'name': name, 'format': format, 'source': {'bytes': item.data}}}) elif item.is_image: assert format in ('jpeg', 'png', 'gif', 'webp') content.append({'image': {'format': format, 'source': {'bytes': item.data}}}) elif item.is_video: assert format in ('mkv', 'mov', 'mp4', 'webm', 'flv', 'mpeg', 'mpg', 'wmv', 'three_gp') content.append({'video': {'format': format, 'source': {'bytes': item.data}}}) else: raise NotImplementedError('Binary content is not supported yet.') elif isinstance(item, ImageUrl | DocumentUrl | VideoUrl): downloaded_item = await download_item(item, data_format='bytes', type_format='extension') format = downloaded_item['data_type'] if item.kind == 'image-url': format = item.media_type.split('/')[1] assert format in ('jpeg', 'png', 'gif', 'webp'), f'Unsupported image format: {format}' image: ImageBlockTypeDef = {'format': format, 'source': {'bytes': downloaded_item['data']}} content.append({'image': image}) elif item.kind == 'document-url': name = f'Document {next(document_count)}' document: DocumentBlockTypeDef = { 'name': name, 'format': item.format, 'source': {'bytes': downloaded_item['data']}, } content.append({'document': document}) elif item.kind == 'video-url': # pragma: no branch format = item.media_type.split('/')[1] assert format in ( 'mkv', 'mov', 'mp4', 'webm', 'flv', 'mpeg', 'mpg', 'wmv', 'three_gp', ), f'Unsupported video format: {format}' video: VideoBlockTypeDef = {'format': format, 'source': {'bytes': downloaded_item['data']}} content.append({'video': video}) elif isinstance(item, AudioUrl): # pragma: no cover raise NotImplementedError('Audio is not supported yet.') else: assert_never(item) return [{'role': 'user', 'content': content}] @staticmethod def _map_tool_call(t: ToolCallPart) -> ContentBlockOutputTypeDef: return { 'toolUse': {'toolUseId': _utils.guard_tool_call_id(t=t), 'name': t.tool_name, 'input': t.args_as_dict()} } ``` #### __init__ ```python __init__( model_name: BedrockModelName, *, provider: ( Literal["bedrock"] | Provider[BaseClient] ) = "bedrock", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a Bedrock model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `BedrockModelName` | The name of the Bedrock model to use. List of model names available here. | *required* | | `provider` | `Literal['bedrock'] | Provider[BaseClient]` | The provider to use for authentication and API access. Can be either the string 'bedrock' or an instance of Provider[BaseClient]. If not provided, a new provider will be created using the other parameters.
| `'bedrock'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python def __init__( self, model_name: BedrockModelName, *, provider: Literal['bedrock'] | Provider[BaseClient] = 'bedrock', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Bedrock model. Args: model_name: The name of the model to use. model_name: The name of the Bedrock model to use. List of model names available [here](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). provider: The provider to use for authentication and API access. Can be either the string 'bedrock' or an instance of `Provider[BaseClient]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = cast('BedrockRuntimeClient', provider.client) super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: str ``` The model name. #### system ```python system: str ``` The model provider. ### BedrockStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Bedrock models. Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python @dataclass class BedrockStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Bedrock models.""" _model_name: BedrockModelName _event_stream: EventStream[ConverseStreamOutputTypeDef] _provider_name: str _timestamp: datetime = field(default_factory=_utils.now_utc) _provider_response_id: str | None = None async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: # noqa: C901 """Return an async iterator of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s. This method should be implemented by subclasses to translate the vendor-specific stream of events into pydantic_ai-format events. 
""" if self._provider_response_id is not None: # pragma: no cover self.provider_response_id = self._provider_response_id chunk: ConverseStreamOutputTypeDef tool_id: str | None = None async for chunk in _AsyncIteratorWrapper(self._event_stream): match chunk: case {'messageStart': _}: continue case {'messageStop': message_stop}: raw_finish_reason = message_stop['stopReason'] self.provider_details = {'finish_reason': raw_finish_reason} self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) case {'metadata': metadata}: if 'usage' in metadata: # pragma: no branch self._usage += self._map_usage(metadata) case {'contentBlockStart': content_block_start}: index = content_block_start['contentBlockIndex'] start = content_block_start['start'] if 'toolUse' in start: # pragma: no branch tool_use_start = start['toolUse'] tool_id = tool_use_start['toolUseId'] tool_name = tool_use_start['name'] maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=index, tool_name=tool_name, args=None, tool_call_id=tool_id, ) if maybe_event: # pragma: no branch yield maybe_event case {'contentBlockDelta': content_block_delta}: index = content_block_delta['contentBlockIndex'] delta = content_block_delta['delta'] if 'reasoningContent' in delta: if redacted_content := delta['reasoningContent'].get('redactedContent'): yield self._parts_manager.handle_thinking_delta( vendor_part_id=index, id='redacted_content', signature=redacted_content.decode('utf-8'), provider_name=self.provider_name, ) else: signature = delta['reasoningContent'].get('signature') yield self._parts_manager.handle_thinking_delta( vendor_part_id=index, content=delta['reasoningContent'].get('text'), signature=signature, provider_name=self.provider_name if signature else None, ) if 'text' in delta: maybe_event = self._parts_manager.handle_text_delta(vendor_part_id=index, content=delta['text']) if maybe_event is not None: # pragma: no branch yield maybe_event if 'toolUse' in delta: tool_use = delta['toolUse'] maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=index, tool_name=tool_use.get('name'), args=tool_use.get('input'), tool_call_id=tool_id, ) if maybe_event: # pragma: no branch yield maybe_event case _: pass # pyright wants match statements to be exhaustive @property def model_name(self) -> str: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def timestamp(self) -> datetime: return self._timestamp def _map_usage(self, metadata: ConverseStreamMetadataEventTypeDef) -> usage.RequestUsage: return usage.RequestUsage( input_tokens=metadata['usage']['inputTokens'], output_tokens=metadata['usage']['outputTokens'], ) ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. # `pydantic_ai.models.cohere` ## Setup For details on how to set up authentication with this model, see [model configuration for Cohere](../../../models/cohere/). ### LatestCohereModelNames ```python LatestCohereModelNames = Literal[ "c4ai-aya-expanse-32b", "c4ai-aya-expanse-8b", "command", "command-light", "command-light-nightly", "command-nightly", "command-r", "command-r-03-2024", "command-r-08-2024", "command-r-plus", "command-r-plus-04-2024", "command-r-plus-08-2024", "command-r7b-12-2024", ] ``` Latest Cohere models. 
### CohereModelName ```python CohereModelName = str | LatestCohereModelNames ``` Possible Cohere model names. Since Cohere supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [Cohere's docs](https://docs.cohere.com/v2/docs/models) for a list of all available models. ### CohereModelSettings Bases: `ModelSettings` Settings used for a Cohere model request. Source code in `pydantic_ai_slim/pydantic_ai/models/cohere.py` ```python class CohereModelSettings(ModelSettings, total=False): """Settings used for a Cohere model request.""" ``` ### CohereModel Bases: `Model` A model that uses the Cohere API. Internally, this uses the [Cohere Python client](https://github.com/cohere-ai/cohere-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/cohere.py` ```python @dataclass(init=False) class CohereModel(Model): """A model that uses the Cohere API. Internally, this uses the [Cohere Python client]( https://github.com/cohere-ai/cohere-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncClientV2 = field(repr=False) _model_name: CohereModelName = field(repr=False) _provider: Provider[AsyncClientV2] = field(repr=False) def __init__( self, model_name: CohereModelName, *, provider: Literal['cohere'] | Provider[AsyncClientV2] = 'cohere', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an Cohere model. Args: model_name: The name of the Cohere model to use. List of model names available [here](https://docs.cohere.com/docs/models#command). provider: The provider to use for authentication and API access. Can be either the string 'cohere' or an instance of `Provider[AsyncClientV2]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: client_wrapper = self.client._client_wrapper # type: ignore return str(client_wrapper.get_base_url()) @property def model_name(self) -> CohereModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() response = await self._chat(messages, cast(CohereModelSettings, model_settings or {}), model_request_parameters) model_response = self._process_response(response) return model_response async def _chat( self, messages: list[ModelMessage], model_settings: CohereModelSettings, model_request_parameters: ModelRequestParameters, ) -> V2ChatResponse: tools = self._get_tools(model_request_parameters) if model_request_parameters.builtin_tools: raise UserError('Cohere does not support built-in tools') cohere_messages = self._map_messages(messages) try: return await self.client.chat( model=self._model_name, messages=cohere_messages, tools=tools or OMIT, max_tokens=model_settings.get('max_tokens', OMIT), stop_sequences=model_settings.get('stop_sequences', OMIT), temperature=model_settings.get('temperature', OMIT), p=model_settings.get('top_p', OMIT), seed=model_settings.get('seed', OMIT), presence_penalty=model_settings.get('presence_penalty', OMIT), frequency_penalty=model_settings.get('frequency_penalty', OMIT), ) except ApiError as e: if (status_code := e.status_code) and status_code >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover def _process_response(self, response: V2ChatResponse) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" parts: list[ModelResponsePart] = [] if response.message.content is not None: for content in response.message.content: if content.type == 'text': parts.append(TextPart(content=content.text)) elif content.type == 'thinking': # pragma: no branch parts.append(ThinkingPart(content=content.thinking)) for c in response.message.tool_calls or []: if c.function and c.function.name and c.function.arguments: # pragma: no branch parts.append( ToolCallPart( tool_name=c.function.name, args=c.function.arguments, tool_call_id=c.id or _generate_tool_call_id(), ) ) raw_finish_reason = response.finish_reason provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=parts, usage=_map_usage(response), model_name=self._model_name, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) def _map_messages(self, messages: list[ModelMessage]) -> list[ChatMessageV2]: """Just maps a `pydantic_ai.Message` to a `cohere.ChatMessageV2`.""" cohere_messages: list[ChatMessageV2] = [] for message in messages: if isinstance(message, ModelRequest): cohere_messages.extend(self._map_user_message(message)) elif isinstance(message, ModelResponse): texts: list[str] = [] thinking: list[str] = [] tool_calls: list[ToolCallV2] = [] for item in message.parts: if isinstance(item, TextPart): texts.append(item.content) elif 
isinstance(item, ThinkingPart): thinking.append(item.content) elif isinstance(item, ToolCallPart): tool_calls.append(self._map_tool_call(item)) elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # This is currently never returned from cohere pass else: assert_never(item) message_param = AssistantChatMessageV2(role='assistant') if texts or thinking: contents: list[AssistantMessageV2ContentItem] = [] if thinking: contents.append(ThinkingAssistantMessageV2ContentItem(thinking='\n\n'.join(thinking))) if texts: # pragma: no branch contents.append(TextAssistantMessageV2ContentItem(text='\n\n'.join(texts))) message_param.content = contents if tool_calls: message_param.tool_calls = tool_calls cohere_messages.append(message_param) else: assert_never(message) if instructions := self._get_instructions(messages): cohere_messages.insert(0, SystemChatMessageV2(role='system', content=instructions)) return cohere_messages def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[ToolV2]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] @staticmethod def _map_tool_call(t: ToolCallPart) -> ToolCallV2: return ToolCallV2( id=_guard_tool_call_id(t=t), type='function', function=ToolCallV2Function( name=t.tool_name, arguments=t.args_as_json_str(), ), ) @staticmethod def _map_tool_definition(f: ToolDefinition) -> ToolV2: return ToolV2( type='function', function=ToolV2Function( name=f.name, description=f.description, parameters=f.parameters_json_schema, ), ) @classmethod def _map_user_message(cls, message: ModelRequest) -> Iterable[ChatMessageV2]: for part in message.parts: if isinstance(part, SystemPromptPart): yield SystemChatMessageV2(role='system', content=part.content) elif isinstance(part, UserPromptPart): if isinstance(part.content, str): yield UserChatMessageV2(role='user', content=part.content) else: raise RuntimeError('Cohere does not yet support multi-modal inputs.') elif isinstance(part, ToolReturnPart): yield ToolChatMessageV2( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response_str(), ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: yield UserChatMessageV2(role='user', content=part.model_response()) # pragma: no cover else: yield ToolChatMessageV2( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response(), ) else: assert_never(part) ``` #### __init__ ```python __init__( model_name: CohereModelName, *, provider: ( Literal["cohere"] | Provider[AsyncClientV2] ) = "cohere", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize an Cohere model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `CohereModelName` | The name of the Cohere model to use. List of model names available here. | *required* | | `provider` | `Literal['cohere'] | Provider[AsyncClientV2]` | The provider to use for authentication and API access. Can be either the string 'cohere' or an instance of Provider[AsyncClientV2]. If not provided, a new provider will be created using the other parameters. | `'cohere'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. 
| `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/cohere.py` ```python def __init__( self, model_name: CohereModelName, *, provider: Literal['cohere'] | Provider[AsyncClientV2] = 'cohere', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an Cohere model. Args: model_name: The name of the Cohere model to use. List of model names available [here](https://docs.cohere.com/docs/models#command). provider: The provider to use for authentication and API access. Can be either the string 'cohere' or an instance of `Provider[AsyncClientV2]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: CohereModelName ``` The model name. #### system ```python system: str ``` The model provider. # pydantic_ai.models.fallback ### FallbackModel Bases: `Model` A model that uses one or more fallback models upon failure. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/fallback.py` ```python @dataclass(init=False) class FallbackModel(Model): """A model that uses one or more fallback models upon failure. Apart from `__init__`, all methods are private or match those of the base class. """ models: list[Model] _model_name: str = field(repr=False) _fallback_on: Callable[[Exception], bool] def __init__( self, default_model: Model | KnownModelName | str, *fallback_models: Model | KnownModelName | str, fallback_on: Callable[[Exception], bool] | tuple[type[Exception], ...] = (ModelHTTPError,), ): """Initialize a fallback model instance. Args: default_model: The name or instance of the default model to use. fallback_models: The names or instances of the fallback models to use upon failure. fallback_on: A callable or tuple of exceptions that should trigger a fallback. """ super().__init__() self.models = [infer_model(default_model), *[infer_model(m) for m in fallback_models]] if isinstance(fallback_on, tuple): self._fallback_on = _default_fallback_condition_factory(fallback_on) else: self._fallback_on = fallback_on @property def model_name(self) -> str: """The model name.""" return f'fallback:{",".join(model.model_name for model in self.models)}' @property def system(self) -> str: return f'fallback:{",".join(model.system for model in self.models)}' @property def base_url(self) -> str | None: return self.models[0].base_url async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Try each model in sequence until one succeeds. In case of failure, raise a FallbackExceptionGroup with all exceptions. 
""" exceptions: list[Exception] = [] for model in self.models: customized_model_request_parameters = model.customize_request_parameters(model_request_parameters) merged_settings = merge_model_settings(model.settings, model_settings) try: response = await model.request(messages, merged_settings, customized_model_request_parameters) except Exception as exc: if self._fallback_on(exc): exceptions.append(exc) continue raise exc self._set_span_attributes(model) return response raise FallbackExceptionGroup('All models from FallbackModel failed', exceptions) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Try each model in sequence until one succeeds.""" exceptions: list[Exception] = [] for model in self.models: customized_model_request_parameters = model.customize_request_parameters(model_request_parameters) merged_settings = merge_model_settings(model.settings, model_settings) async with AsyncExitStack() as stack: try: response = await stack.enter_async_context( model.request_stream( messages, merged_settings, customized_model_request_parameters, run_context ) ) except Exception as exc: if self._fallback_on(exc): exceptions.append(exc) continue raise exc # pragma: no cover self._set_span_attributes(model) yield response return raise FallbackExceptionGroup('All models from FallbackModel failed', exceptions) def _set_span_attributes(self, model: Model): with suppress(Exception): span = get_current_span() if span.is_recording(): attributes = getattr(span, 'attributes', {}) if attributes.get('gen_ai.request.model') == self.model_name: # pragma: no branch span.set_attributes(InstrumentedModel.model_attributes(model)) ``` #### __init__ ```python __init__( default_model: Model | KnownModelName | str, *fallback_models: Model | KnownModelName | str, fallback_on: ( Callable[[Exception], bool] | tuple[type[Exception], ...] ) = (ModelHTTPError,) ) ``` Initialize a fallback model instance. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `default_model` | `Model | KnownModelName | str` | The name or instance of the default model to use. | *required* | | `fallback_models` | `Model | KnownModelName | str` | The names or instances of the fallback models to use upon failure. | `()` | | `fallback_on` | `Callable[[Exception], bool] | tuple[type[Exception], ...]` | A callable or tuple of exceptions that should trigger a fallback. | `(ModelHTTPError,)` | Source code in `pydantic_ai_slim/pydantic_ai/models/fallback.py` ```python def __init__( self, default_model: Model | KnownModelName | str, *fallback_models: Model | KnownModelName | str, fallback_on: Callable[[Exception], bool] | tuple[type[Exception], ...] = (ModelHTTPError,), ): """Initialize a fallback model instance. Args: default_model: The name or instance of the default model to use. fallback_models: The names or instances of the fallback models to use upon failure. fallback_on: A callable or tuple of exceptions that should trigger a fallback. """ super().__init__() self.models = [infer_model(default_model), *[infer_model(m) for m in fallback_models]] if isinstance(fallback_on, tuple): self._fallback_on = _default_fallback_condition_factory(fallback_on) else: self._fallback_on = fallback_on ``` #### model_name ```python model_name: str ``` The model name. 
#### request ```python request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Try each model in sequence until one succeeds. In case of failure, raise a FallbackExceptionGroup with all exceptions. Source code in `pydantic_ai_slim/pydantic_ai/models/fallback.py` ```python async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Try each model in sequence until one succeeds. In case of failure, raise a FallbackExceptionGroup with all exceptions. """ exceptions: list[Exception] = [] for model in self.models: customized_model_request_parameters = model.customize_request_parameters(model_request_parameters) merged_settings = merge_model_settings(model.settings, model_settings) try: response = await model.request(messages, merged_settings, customized_model_request_parameters) except Exception as exc: if self._fallback_on(exc): exceptions.append(exc) continue raise exc self._set_span_attributes(model) return response raise FallbackExceptionGroup('All models from FallbackModel failed', exceptions) ``` #### request_stream ```python request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Try each model in sequence until one succeeds. Source code in `pydantic_ai_slim/pydantic_ai/models/fallback.py` ```python @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Try each model in sequence until one succeeds.""" exceptions: list[Exception] = [] for model in self.models: customized_model_request_parameters = model.customize_request_parameters(model_request_parameters) merged_settings = merge_model_settings(model.settings, model_settings) async with AsyncExitStack() as stack: try: response = await stack.enter_async_context( model.request_stream( messages, merged_settings, customized_model_request_parameters, run_context ) ) except Exception as exc: if self._fallback_on(exc): exceptions.append(exc) continue raise exc # pragma: no cover self._set_span_attributes(model) yield response return raise FallbackExceptionGroup('All models from FallbackModel failed', exceptions) ``` # `pydantic_ai.models.function` A model controlled by a local function. FunctionModel is similar to [`TestModel`](../test/), but allows greater control over the model's behavior. Its primary use case is for more advanced unit testing than is possible with `TestModel`. 
Here's a minimal example: function_model_usage.py ```py from pydantic_ai import Agent from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart from pydantic_ai.models.function import FunctionModel, AgentInfo my_agent = Agent('openai:gpt-4o') async def model_function( messages: list[ModelMessage], info: AgentInfo ) -> ModelResponse: print(messages) """ [ ModelRequest( parts=[ UserPromptPart( content='Testing my agent...', timestamp=datetime.datetime(...), ) ] ) ] """ print(info) """ AgentInfo( function_tools=[], allow_text_output=True, output_tools=[], model_settings=None ) """ return ModelResponse(parts=[TextPart('hello world')]) async def test_my_agent(): """Unit test for my_agent, to be run by pytest.""" with my_agent.override(model=FunctionModel(model_function)): result = await my_agent.run('Testing my agent...') assert result.output == 'hello world' ``` See [Unit testing with `FunctionModel`](../../../testing/#unit-testing-with-functionmodel) for detailed documentation. ### FunctionModel Bases: `Model` A model controlled by a local function. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass(init=False) class FunctionModel(Model): """A model controlled by a local function. Apart from `__init__`, all methods are private or match those of the base class. """ function: FunctionDef | None stream_function: StreamFunctionDef | None _model_name: str = field(repr=False) _system: str = field(default='function', repr=False) @overload def __init__( self, function: FunctionDef, *, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None: ... @overload def __init__( self, *, stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None: ... @overload def __init__( self, function: FunctionDef, *, stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None: ... def __init__( self, function: FunctionDef | None = None, *, stream_function: StreamFunctionDef | None = None, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a `FunctionModel`. Either `function` or `stream_function` must be provided, providing both is allowed. Args: function: The function to call for non-streamed requests. stream_function: The function to call for streamed requests. model_name: The name of the model. If not provided, a name is generated from the function names. profile: The model profile to use. settings: Model-specific settings that will be used as defaults for this model. 
""" if function is None and stream_function is None: raise TypeError('Either `function` or `stream_function` must be provided') self.function = function self.stream_function = stream_function function_name = self.function.__name__ if self.function is not None else '' stream_function_name = self.stream_function.__name__ if self.stream_function is not None else '' self._model_name = model_name or f'function:{function_name}:{stream_function_name}' # Use a default profile that supports JSON schema and object output if none provided if profile is None: profile = ModelProfile( supports_json_schema_output=True, supports_json_object_output=True, ) super().__init__(settings=settings, profile=profile) async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: agent_info = AgentInfo( function_tools=model_request_parameters.function_tools, allow_text_output=model_request_parameters.allow_text_output, output_tools=model_request_parameters.output_tools, model_settings=model_settings, ) assert self.function is not None, 'FunctionModel must receive a `function` to support non-streamed requests' if inspect.iscoroutinefunction(self.function): response = await self.function(messages, agent_info) else: response_ = await _utils.run_in_executor(self.function, messages, agent_info) assert isinstance(response_, ModelResponse), response_ response = response_ response.model_name = self._model_name # Add usage data if not already present if not response.usage.has_values(): # pragma: no branch response.usage = _estimate_usage(chain(messages, [response])) return response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: agent_info = AgentInfo( function_tools=model_request_parameters.function_tools, allow_text_output=model_request_parameters.allow_text_output, output_tools=model_request_parameters.output_tools, model_settings=model_settings, ) assert self.stream_function is not None, ( 'FunctionModel must receive a `stream_function` to support streamed requests' ) response_stream = PeekableAsyncStream(self.stream_function(messages, agent_info)) first = await response_stream.peek() if isinstance(first, _utils.Unset): raise ValueError('Stream function must return at least one item') yield FunctionStreamedResponse( model_request_parameters=model_request_parameters, _model_name=self._model_name, _iter=response_stream, ) @property def model_name(self) -> str: """The model name.""" return self._model_name @property def system(self) -> str: """The system / model provider.""" return self._system ``` #### __init__ ```python __init__( function: FunctionDef, *, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) -> None ``` ```python __init__( *, stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) -> None ``` ```python __init__( function: FunctionDef, *, stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) -> None ``` ```python __init__( function: FunctionDef | None = None, *, stream_function: StreamFunctionDef | None = None, model_name: str | None = None, profile: ModelProfileSpec 
| None = None, settings: ModelSettings | None = None ) ``` Initialize a `FunctionModel`. Either `function` or `stream_function` must be provided; providing both is allowed. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `function` | `FunctionDef | None` | The function to call for non-streamed requests. | `None` | | `stream_function` | `StreamFunctionDef | None` | The function to call for streamed requests. | `None` | | `model_name` | `str | None` | The name of the model. If not provided, a name is generated from the function names. | `None` | | `profile` | `ModelProfileSpec | None` | The model profile to use. | `None` | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python def __init__( self, function: FunctionDef | None = None, *, stream_function: StreamFunctionDef | None = None, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a `FunctionModel`. Either `function` or `stream_function` must be provided, providing both is allowed. Args: function: The function to call for non-streamed requests. stream_function: The function to call for streamed requests. model_name: The name of the model. If not provided, a name is generated from the function names. profile: The model profile to use. settings: Model-specific settings that will be used as defaults for this model. """ if function is None and stream_function is None: raise TypeError('Either `function` or `stream_function` must be provided') self.function = function self.stream_function = stream_function function_name = self.function.__name__ if self.function is not None else '' stream_function_name = self.stream_function.__name__ if self.stream_function is not None else '' self._model_name = model_name or f'function:{function_name}:{stream_function_name}' # Use a default profile that supports JSON schema and object output if none provided if profile is None: profile = ModelProfile( supports_json_schema_output=True, supports_json_object_output=True, ) super().__init__(settings=settings, profile=profile) ``` #### model_name ```python model_name: str ``` The model name. #### system ```python system: str ``` The system / model provider. ### AgentInfo Information about an agent. This is passed as the second argument to functions used within FunctionModel. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass(frozen=True, kw_only=True) class AgentInfo: """Information about an agent. This is passed as the second argument to functions used within [`FunctionModel`][pydantic_ai.models.function.FunctionModel]. """ function_tools: list[ToolDefinition] """The function tools available on this agent. These are the tools registered via the [`tool`][pydantic_ai.Agent.tool] and [`tool_plain`][pydantic_ai.Agent.tool_plain] decorators. """ allow_text_output: bool """Whether a plain text output is allowed.""" output_tools: list[ToolDefinition] """The tools that can be called to produce the final output of the run.""" model_settings: ModelSettings | None """The model settings passed to the run call.""" ``` #### function_tools ```python function_tools: list[ToolDefinition] ``` The function tools available on this agent. These are the tools registered via the tool and tool_plain decorators. #### allow_text_output ```python allow_text_output: bool ``` Whether a plain text output is allowed.
#### output_tools ```python output_tools: list[ToolDefinition] ``` The tools that can be called to produce the final output of the run. #### model_settings ```python model_settings: ModelSettings | None ``` The model settings passed to the run call. ### DeltaToolCall Incremental change to a tool call. Used to describe a chunk when streaming structured responses. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass class DeltaToolCall: """Incremental change to a tool call. Used to describe a chunk when streaming structured responses. """ name: str | None = None """Incremental change to the name of the tool.""" json_args: str | None = None """Incremental change to the arguments as JSON""" _: KW_ONLY tool_call_id: str | None = None """Incremental change to the tool call ID.""" ``` #### name ```python name: str | None = None ``` Incremental change to the name of the tool. #### json_args ```python json_args: str | None = None ``` Incremental change to the arguments as JSON. #### tool_call_id ```python tool_call_id: str | None = None ``` Incremental change to the tool call ID. ### DeltaThinkingPart Incremental change to a thinking part. Used to describe a chunk when streaming thinking responses. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass(kw_only=True) class DeltaThinkingPart: """Incremental change to a thinking part. Used to describe a chunk when streaming thinking responses. """ content: str | None = None """Incremental change to the thinking content.""" signature: str | None = None """Incremental change to the thinking signature.""" ``` #### content ```python content: str | None = None ``` Incremental change to the thinking content. #### signature ```python signature: str | None = None ``` Incremental change to the thinking signature. ### DeltaToolCalls ```python DeltaToolCalls: TypeAlias = dict[int, DeltaToolCall] ``` A mapping of tool call IDs to incremental changes. ### DeltaThinkingCalls ```python DeltaThinkingCalls: TypeAlias = dict[int, DeltaThinkingPart] ``` A mapping of thinking call IDs to incremental changes. ### FunctionDef ```python FunctionDef: TypeAlias = Callable[ [list[ModelMessage], AgentInfo], ModelResponse | Awaitable[ModelResponse], ] ``` A function used to generate a non-streamed response. ### StreamFunctionDef ```python StreamFunctionDef: TypeAlias = Callable[ [list[ModelMessage], AgentInfo], AsyncIterator[ str | DeltaToolCalls | DeltaThinkingCalls ], ] ``` A function used to generate a streamed response. While this is defined as having a return type of `AsyncIterator[str | DeltaToolCalls | DeltaThinkingCalls]`, it should really be considered as `AsyncIterator[str] | AsyncIterator[DeltaToolCalls] | AsyncIterator[DeltaThinkingCalls]`: you need to yield all text, all `DeltaToolCalls`, or all `DeltaThinkingCalls`, not a mix of them. ### FunctionStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for FunctionModel.
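To tie these pieces together, here is a minimal sketch of a `FunctionModel` built from one `FunctionDef` and one `StreamFunctionDef`; during streamed runs, `FunctionStreamedResponse` wraps the iterator returned by the stream function. The function bodies, prompt, and canned output below are illustrative assumptions, not part of the API:

function_model_sketch.py

```python
from collections.abc import AsyncIterator

from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models.function import AgentInfo, DeltaToolCalls, FunctionModel


def model_function(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    # Non-streamed requests: return a complete response in one go.
    # `info` describes the tools, output tools and settings for the run.
    return ModelResponse(parts=[TextPart('hello world')])


async def stream_model_function(
    messages: list[ModelMessage], info: AgentInfo
) -> AsyncIterator[str | DeltaToolCalls]:
    # Streamed requests: yield either text chunks or `DeltaToolCalls` chunks,
    # never a mix of the two.
    yield 'hello '
    yield 'world'


agent = Agent(FunctionModel(model_function, stream_function=stream_model_function))
result = agent.run_sync('Greet me')
print(result.output)
#> hello world
```

Because the model never leaves the process, this pattern is mostly useful for tests and for inspecting exactly what messages and tool definitions an agent sends.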
Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass class FunctionStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for [FunctionModel][pydantic_ai.models.function.FunctionModel].""" _model_name: str _iter: AsyncIterator[str | DeltaToolCalls | DeltaThinkingCalls] _timestamp: datetime = field(default_factory=_utils.now_utc) def __post_init__(self): self._usage += _estimate_usage([]) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: async for item in self._iter: if isinstance(item, str): response_tokens = _estimate_string_tokens(item) self._usage += usage.RequestUsage(output_tokens=response_tokens) maybe_event = self._parts_manager.handle_text_delta(vendor_part_id='content', content=item) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(item, dict) and item: for dtc_index, delta in item.items(): if isinstance(delta, DeltaThinkingPart): if delta.content: # pragma: no branch response_tokens = _estimate_string_tokens(delta.content) self._usage += usage.RequestUsage(output_tokens=response_tokens) yield self._parts_manager.handle_thinking_delta( vendor_part_id=dtc_index, content=delta.content, signature=delta.signature, provider_name='function' if delta.signature else None, ) elif isinstance(delta, DeltaToolCall): if delta.json_args: response_tokens = _estimate_string_tokens(delta.json_args) self._usage += usage.RequestUsage(output_tokens=response_tokens) maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=dtc_index, tool_name=delta.name, args=delta.json_args, tool_call_id=delta.tool_call_id, ) if maybe_event is not None: # pragma: no branch yield maybe_event else: assert_never(delta) @property def model_name(self) -> str: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> None: """Get the provider name.""" return None @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: None ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models.google` Interface that uses the [`google-genai`](https://pypi.org/project/google-genai/) package under the hood to access Google's Gemini models via both the Generative Language API and Vertex AI. ## Setup For details on how to set up authentication with this model, see [model configuration for Google](../../../models/google/). ### LatestGoogleModelNames ```python LatestGoogleModelNames = Literal[ "gemini-2.0-flash", "gemini-2.0-flash-lite", "gemini-2.5-flash", "gemini-2.5-flash-lite", "gemini-2.5-pro", ] ``` Latest Gemini models. ### GoogleModelName ```python GoogleModelName = str | LatestGoogleModelNames ``` Possible Gemini model names. Since Gemini supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Gemini API docs](https://ai.google.dev/gemini-api/docs/models/gemini#model-variations) for a full list. ### GoogleModelSettings Bases: `ModelSettings` Settings used for a Gemini model request. Source code in `pydantic_ai_slim/pydantic_ai/models/google.py` ```python class GoogleModelSettings(ModelSettings, total=False): """Settings used for a Gemini model request.""" # ALL FIELDS MUST BE `gemini_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. 
google_safety_settings: list[SafetySettingDict] """The safety settings to use for the model. See the Gemini API docs for more information. """ google_thinking_config: ThinkingConfigDict """The thinking configuration to use for the model. See the Gemini API docs for more information. """ google_labels: dict[str, str] """User-defined metadata to break down billed charges. Only supported by the Vertex AI API. See the [Gemini API docs](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/add-labels-to-api-calls) for use cases and limitations. """ google_video_resolution: MediaResolution """The video resolution to use for the model. See the Gemini API docs for more information. """ google_cached_content: str """The name of the cached content to use for the model. See the Gemini API docs for more information. """ ``` #### google_safety_settings ```python google_safety_settings: list[SafetySettingDict] ``` The safety settings to use for the model. See the Gemini API docs for more information. #### google_thinking_config ```python google_thinking_config: ThinkingConfigDict ``` The thinking configuration to use for the model. See the Gemini API docs for more information. #### google_labels ```python google_labels: dict[str, str] ``` User-defined metadata to break down billed charges. Only supported by the Vertex AI API. See the [Gemini API docs](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/add-labels-to-api-calls) for use cases and limitations. #### google_video_resolution ```python google_video_resolution: MediaResolution ``` The video resolution to use for the model. See the Gemini API docs for more information. #### google_cached_content ```python google_cached_content: str ``` The name of the cached content to use for the model. See the Gemini API docs for more information. ### GoogleModel Bases: `Model` A model that uses Gemini via `generativelanguage.googleapis.com` API. This is implemented from scratch rather than using a dedicated SDK; good API documentation is available [here](https://ai.google.dev/api). Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/google.py` ```python @dataclass(init=False) class GoogleModel(Model): """A model that uses Gemini via `generativelanguage.googleapis.com` API. This is implemented from scratch rather than using a dedicated SDK, good API documentation is available [here](https://ai.google.dev/api). Apart from `__init__`, all methods are private or match those of the base class. """ client: Client = field(repr=False) _model_name: GoogleModelName = field(repr=False) _provider: Provider[Client] = field(repr=False) _url: str | None = field(repr=False) def __init__( self, model_name: GoogleModelName, *, provider: Literal['google-gla', 'google-vertex'] | Provider[Client] = 'google-gla', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Gemini model. Args: model_name: The name of the model to use. provider: The provider to use for authentication and API access. Can be either the string 'google-gla' or 'google-vertex' or an instance of `Provider[httpx.AsyncClient]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: The model settings to use. Defaults to None.
""" self._model_name = model_name if isinstance(provider, str): provider = GoogleProvider(vertexai=provider == 'google-vertex') self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return self._provider.base_url @property def model_name(self) -> GoogleModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings = cast(GoogleModelSettings, model_settings or {}) response = await self._generate_content(messages, False, model_settings, model_request_parameters) return self._process_response(response) async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> usage.RequestUsage: check_allow_model_requests() model_settings = cast(GoogleModelSettings, model_settings or {}) contents, generation_config = await self._build_content_and_config( messages, model_settings, model_request_parameters ) # Annoyingly, the type of `GenerateContentConfigDict.get` is "partially `Unknown`" because `response_schema` includes `typing._UnionGenericAlias`, # so without this we'd need `pyright: ignore[reportUnknownMemberType]` on every line and wouldn't get type checking anyway. generation_config = cast(dict[str, Any], generation_config) config = CountTokensConfigDict( http_options=generation_config.get('http_options'), ) if self._provider.name != 'google-gla': # The fields are not supported by the Gemini API per https://github.com/googleapis/python-genai/blob/7e4ec284dc6e521949626f3ed54028163ef9121d/google/genai/models.py#L1195-L1214 config.update( # pragma: lax no cover system_instruction=generation_config.get('system_instruction'), tools=cast(list[ToolDict], generation_config.get('tools')), # Annoyingly, GenerationConfigDict has fewer fields than GenerateContentConfigDict, and no extra fields are allowed. 
generation_config=GenerationConfigDict( temperature=generation_config.get('temperature'), top_p=generation_config.get('top_p'), max_output_tokens=generation_config.get('max_output_tokens'), stop_sequences=generation_config.get('stop_sequences'), presence_penalty=generation_config.get('presence_penalty'), frequency_penalty=generation_config.get('frequency_penalty'), seed=generation_config.get('seed'), thinking_config=generation_config.get('thinking_config'), media_resolution=generation_config.get('media_resolution'), response_mime_type=generation_config.get('response_mime_type'), response_schema=generation_config.get('response_schema'), ), ) response = await self.client.aio.models.count_tokens( model=self._model_name, contents=contents, config=config, ) if response.total_tokens is None: raise UnexpectedModelBehavior( # pragma: no cover 'Total tokens missing from Gemini response', str(response) ) return usage.RequestUsage( input_tokens=response.total_tokens, ) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() model_settings = cast(GoogleModelSettings, model_settings or {}) response = await self._generate_content(messages, True, model_settings, model_request_parameters) yield await self._process_streamed_response(response, model_request_parameters) # type: ignore def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[ToolDict] | None: if model_request_parameters.builtin_tools: if model_request_parameters.output_tools: raise UserError( 'Gemini does not support output tools and built-in tools at the same time. Use `output_type=PromptedOutput(...)` instead.' ) if model_request_parameters.function_tools: raise UserError('Gemini does not support user tools and built-in tools at the same time.') tools: list[ToolDict] = [ ToolDict(function_declarations=[_function_declaration_from_tool(t)]) for t in model_request_parameters.tool_defs.values() ] for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): tools.append(ToolDict(google_search=GoogleSearchDict())) elif isinstance(tool, UrlContextTool): tools.append(ToolDict(url_context=UrlContextDict())) elif isinstance(tool, CodeExecutionTool): # pragma: no branch tools.append(ToolDict(code_execution=ToolCodeExecutionDict())) else: # pragma: no cover raise UserError( f'`{tool.__class__.__name__}` is not supported by `GoogleModel`. If it should be, please file an issue.' ) return tools or None def _get_tool_config( self, model_request_parameters: ModelRequestParameters, tools: list[ToolDict] | None ) -> ToolConfigDict | None: if not model_request_parameters.allow_text_output and tools: names: list[str] = [] for tool in tools: for function_declaration in tool.get('function_declarations') or []: if name := function_declaration.get('name'): # pragma: no branch names.append(name) return _tool_config(names) else: return None @overload async def _generate_content( self, messages: list[ModelMessage], stream: Literal[False], model_settings: GoogleModelSettings, model_request_parameters: ModelRequestParameters, ) -> GenerateContentResponse: ... @overload async def _generate_content( self, messages: list[ModelMessage], stream: Literal[True], model_settings: GoogleModelSettings, model_request_parameters: ModelRequestParameters, ) -> Awaitable[AsyncIterator[GenerateContentResponse]]: ... 
async def _generate_content( self, messages: list[ModelMessage], stream: bool, model_settings: GoogleModelSettings, model_request_parameters: ModelRequestParameters, ) -> GenerateContentResponse | Awaitable[AsyncIterator[GenerateContentResponse]]: contents, config = await self._build_content_and_config(messages, model_settings, model_request_parameters) func = self.client.aio.models.generate_content_stream if stream else self.client.aio.models.generate_content return await func(model=self._model_name, contents=contents, config=config) # type: ignore async def _build_content_and_config( self, messages: list[ModelMessage], model_settings: GoogleModelSettings, model_request_parameters: ModelRequestParameters, ) -> tuple[list[ContentUnionDict], GenerateContentConfigDict]: tools = self._get_tools(model_request_parameters) response_mime_type = None response_schema = None if model_request_parameters.output_mode == 'native': if tools: raise UserError( 'Gemini does not support `NativeOutput` and tools at the same time. Use `output_type=ToolOutput(...)` instead.' ) response_mime_type = 'application/json' output_object = model_request_parameters.output_object assert output_object is not None response_schema = self._map_response_schema(output_object) elif model_request_parameters.output_mode == 'prompted' and not tools: response_mime_type = 'application/json' tool_config = self._get_tool_config(model_request_parameters, tools) system_instruction, contents = await self._map_messages(messages) http_options: HttpOptionsDict = { 'headers': {'Content-Type': 'application/json', 'User-Agent': get_user_agent()} } if timeout := model_settings.get('timeout'): if isinstance(timeout, int | float): http_options['timeout'] = int(1000 * timeout) else: raise UserError('Google does not support setting ModelSettings.timeout to a httpx.Timeout') config = GenerateContentConfigDict( http_options=http_options, system_instruction=system_instruction, temperature=model_settings.get('temperature'), top_p=model_settings.get('top_p'), max_output_tokens=model_settings.get('max_tokens'), stop_sequences=model_settings.get('stop_sequences'), presence_penalty=model_settings.get('presence_penalty'), frequency_penalty=model_settings.get('frequency_penalty'), seed=model_settings.get('seed'), safety_settings=model_settings.get('google_safety_settings'), thinking_config=model_settings.get('google_thinking_config'), labels=model_settings.get('google_labels'), media_resolution=model_settings.get('google_video_resolution'), cached_content=model_settings.get('google_cached_content'), tools=cast(ToolListUnionDict, tools), tool_config=tool_config, response_mime_type=response_mime_type, response_schema=response_schema, ) return contents, config def _process_response(self, response: GenerateContentResponse) -> ModelResponse: if not response.candidates or len(response.candidates) != 1: raise UnexpectedModelBehavior('Expected exactly one candidate in Gemini response') # pragma: no cover candidate = response.candidates[0] if candidate.content is None or candidate.content.parts is None: if candidate.finish_reason == 'SAFETY': raise UnexpectedModelBehavior('Safety settings triggered', str(response)) else: raise UnexpectedModelBehavior( 'Content field missing from Gemini response', str(response) ) # pragma: no cover parts = candidate.content.parts or [] vendor_id = response.response_id vendor_details: dict[str, Any] | None = None finish_reason: FinishReason | None = None if raw_finish_reason := candidate.finish_reason: # pragma: no branch 
vendor_details = {'finish_reason': raw_finish_reason.value} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) usage = _metadata_as_usage(response) return _process_response_from_parts( parts, response.model_version or self._model_name, self._provider.name, usage, vendor_id=vendor_id, vendor_details=vendor_details, finish_reason=finish_reason, ) async def _process_streamed_response( self, response: AsyncIterator[GenerateContentResponse], model_request_parameters: ModelRequestParameters ) -> StreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior('Streamed response ended without content or tool calls') # pragma: no cover return GeminiStreamedResponse( model_request_parameters=model_request_parameters, _model_name=first_chunk.model_version or self._model_name, _response=peekable_response, _timestamp=first_chunk.create_time or _utils.now_utc(), _provider_name=self._provider.name, ) async def _map_messages(self, messages: list[ModelMessage]) -> tuple[ContentDict | None, list[ContentUnionDict]]: contents: list[ContentUnionDict] = [] system_parts: list[PartDict] = [] for m in messages: if isinstance(m, ModelRequest): message_parts: list[PartDict] = [] for part in m.parts: if isinstance(part, SystemPromptPart): system_parts.append({'text': part.content}) elif isinstance(part, UserPromptPart): message_parts.extend(await self._map_user_prompt(part)) elif isinstance(part, ToolReturnPart): message_parts.append( { 'function_response': { 'name': part.tool_name, 'response': part.model_response_object(), 'id': part.tool_call_id, } } ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: message_parts.append({'text': part.model_response()}) # pragma: no cover else: message_parts.append( { 'function_response': { 'name': part.tool_name, 'response': {'call_error': part.model_response()}, 'id': part.tool_call_id, } } ) else: assert_never(part) # Google GenAI requires at least one part in the message. if not message_parts: message_parts = [{'text': ''}] contents.append({'role': 'user', 'parts': message_parts}) elif isinstance(m, ModelResponse): contents.append(_content_model_response(m, self.system)) else: assert_never(m) if instructions := self._get_instructions(messages): system_parts.insert(0, {'text': instructions}) system_instruction = ContentDict(role='user', parts=system_parts) if system_parts else None return system_instruction, contents async def _map_user_prompt(self, part: UserPromptPart) -> list[PartDict]: if isinstance(part.content, str): return [{'text': part.content}] else: content: list[PartDict] = [] for item in part.content: if isinstance(item, str): content.append({'text': item}) elif isinstance(item, BinaryContent): # NOTE: The type from Google GenAI is incorrect, it should be `str`, not `bytes`. 
base64_encoded = base64.b64encode(item.data).decode('utf-8') inline_data_dict = {'inline_data': {'data': base64_encoded, 'mime_type': item.media_type}} if item.vendor_metadata: inline_data_dict['video_metadata'] = item.vendor_metadata content.append(inline_data_dict) # type: ignore elif isinstance(item, VideoUrl) and item.is_youtube: file_data_dict = {'file_data': {'file_uri': item.url, 'mime_type': item.media_type}} if item.vendor_metadata: # pragma: no branch file_data_dict['video_metadata'] = item.vendor_metadata content.append(file_data_dict) # type: ignore elif isinstance(item, FileUrl): if item.force_download or ( # google-gla does not support passing file urls directly, except for youtube videos # (see above) and files uploaded to the file API (which cannot be downloaded anyway) self.system == 'google-gla' and not item.url.startswith(r'https://generativelanguage.googleapis.com/v1beta/files') ): downloaded_item = await download_item(item, data_format='base64') inline_data = {'data': downloaded_item['data'], 'mime_type': downloaded_item['data_type']} content.append({'inline_data': inline_data}) # type: ignore else: content.append( {'file_data': {'file_uri': item.url, 'mime_type': item.media_type}} ) # pragma: lax no cover else: assert_never(item) return content def _map_response_schema(self, o: OutputObjectDefinition) -> dict[str, Any]: response_schema = o.json_schema.copy() if o.name: response_schema['title'] = o.name if o.description: response_schema['description'] = o.description return response_schema ``` #### __init__ ```python __init__( model_name: GoogleModelName, *, provider: ( Literal["google-gla", "google-vertex"] | Provider[Client] ) = "google-gla", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a Gemini model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `GoogleModelName` | The name of the model to use. | *required* | | `provider` | `Literal['google-gla', 'google-vertex'] | Provider[Client]` | The provider to use for authentication and API access. Can be either the string 'google-gla' or 'google-vertex' or an instance of Provider[httpx.AsyncClient]. If not provided, a new provider will be created using the other parameters. | `'google-gla'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | The model settings to use. Defaults to None. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/google.py` ```python def __init__( self, model_name: GoogleModelName, *, provider: Literal['google-gla', 'google-vertex'] | Provider[Client] = 'google-gla', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Gemini model. Args: model_name: The name of the model to use. provider: The provider to use for authentication and API access. Can be either the string 'google-gla' or 'google-vertex' or an instance of `Provider[httpx.AsyncClient]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: The model settings to use. Defaults to None. 
""" self._model_name = model_name if isinstance(provider, str): provider = GoogleProvider(vertexai=provider == 'google-vertex') self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: GoogleModelName ``` The model name. #### system ```python system: str ``` The model provider. ### GeminiStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for the Gemini model. Source code in `pydantic_ai_slim/pydantic_ai/models/google.py` ```python @dataclass class GeminiStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for the Gemini model.""" _model_name: GoogleModelName _response: AsyncIterator[GenerateContentResponse] _timestamp: datetime _provider_name: str async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: # noqa: C901 async for chunk in self._response: self._usage = _metadata_as_usage(chunk) assert chunk.candidates is not None candidate = chunk.candidates[0] if chunk.response_id: # pragma: no branch self.provider_response_id = chunk.response_id if raw_finish_reason := candidate.finish_reason: self.provider_details = {'finish_reason': raw_finish_reason.value} self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) if candidate.content is None or candidate.content.parts is None: if candidate.finish_reason == 'STOP': # pragma: no cover # Normal completion - skip this chunk continue elif candidate.finish_reason == 'SAFETY': # pragma: no cover raise UnexpectedModelBehavior('Safety settings triggered', str(chunk)) else: # pragma: no cover raise UnexpectedModelBehavior('Content field missing from streaming Gemini response', str(chunk)) parts = candidate.content.parts or [] for part in parts: if part.thought_signature: signature = base64.b64encode(part.thought_signature).decode('utf-8') yield self._parts_manager.handle_thinking_delta( vendor_part_id='thinking', signature=signature, provider_name=self.provider_name, ) if part.text is not None: if part.thought: yield self._parts_manager.handle_thinking_delta(vendor_part_id='thinking', content=part.text) else: maybe_event = self._parts_manager.handle_text_delta(vendor_part_id='content', content=part.text) if maybe_event is not None: # pragma: no branch yield maybe_event elif part.function_call: maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=uuid4(), tool_name=part.function_call.name, args=part.function_call.args, tool_call_id=part.function_call.id, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif part.executable_code is not None: pass elif part.code_execution_result is not None: pass else: assert part.function_response is not None, f'Unexpected part: {part}' # pragma: no cover @property def model_name(self) -> GoogleModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: GoogleModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models.groq` ## Setup For details on how to set up authentication with this model, see [model configuration for Groq](../../../models/groq/). 
### ProductionGroqModelNames ```python ProductionGroqModelNames = Literal[ "distil-whisper-large-v3-en", "gemma2-9b-it", "llama-3.3-70b-versatile", "llama-3.1-8b-instant", "llama-guard-3-8b", "llama3-70b-8192", "llama3-8b-8192", "whisper-large-v3", "whisper-large-v3-turbo", ] ``` Production Groq models from [the Groq docs](https://console.groq.com/docs/models). ### PreviewGroqModelNames ```python PreviewGroqModelNames = Literal[ "playai-tts", "playai-tts-arabic", "qwen-qwq-32b", "mistral-saba-24b", "qwen-2.5-coder-32b", "qwen-2.5-32b", "deepseek-r1-distill-qwen-32b", "deepseek-r1-distill-llama-70b", "llama-3.3-70b-specdec", "llama-3.2-1b-preview", "llama-3.2-3b-preview", "llama-3.2-11b-vision-preview", "llama-3.2-90b-vision-preview", "moonshotai/kimi-k2-instruct", ] ``` Preview Groq models from [the Groq docs](https://console.groq.com/docs/models). ### GroqModelName ```python GroqModelName = ( str | ProductionGroqModelNames | PreviewGroqModelNames ) ``` Possible Groq model names. Since Groq supports a variety of models and the list changes frequently, we explicitly list the named models as of 2025-03-31 but allow any name in the type hints. See [the Groq docs](https://console.groq.com/docs/models) for an up-to-date list of models and more details. ### GroqModelSettings Bases: `ModelSettings` Settings used for a Groq model request. Source code in `pydantic_ai_slim/pydantic_ai/models/groq.py` ```python class GroqModelSettings(ModelSettings, total=False): """Settings used for a Groq model request.""" # ALL FIELDS MUST BE `groq_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. groq_reasoning_format: Literal['hidden', 'raw', 'parsed'] ``` ### GroqModel Bases: `Model` A model that uses the Groq API. Internally, this uses the [Groq Python client](https://github.com/groq/groq-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/groq.py` ```python @dataclass(init=False) class GroqModel(Model): """A model that uses the Groq API. Internally, this uses the [Groq Python client](https://github.com/groq/groq-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncGroq = field(repr=False) _model_name: GroqModelName = field(repr=False) _provider: Provider[AsyncGroq] = field(repr=False) def __init__( self, model_name: GroqModelName, *, provider: Literal['groq'] | Provider[AsyncGroq] = 'groq', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Groq model. Args: model_name: The name of the Groq model to use. List of model names available [here](https://console.groq.com/docs/models). provider: The provider to use for authentication and API access. Can be either the string 'groq' or an instance of `Provider[AsyncGroq]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model.
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return str(self.client.base_url) @property def model_name(self) -> GroqModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() try: response = await self._completions_create( messages, False, cast(GroqModelSettings, model_settings or {}), model_request_parameters ) except ModelHTTPError as e: if isinstance(e.body, dict): # pragma: no branch # The Groq SDK tries to be helpful by raising an exception when generated tool arguments don't match the schema, # but we'd rather handle it ourselves so we can tell the model to retry the tool call. try: error = _GroqToolUseFailedError.model_validate(e.body) # pyright: ignore[reportUnknownMemberType] tool_call_part = ToolCallPart( tool_name=error.error.failed_generation.name, args=error.error.failed_generation.arguments, ) return ModelResponse( parts=[tool_call_part], model_name=e.model_name, timestamp=_utils.now_utc(), provider_name=self._provider.name, finish_reason='error', ) except ValidationError: pass raise model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() response = await self._completions_create( messages, True, cast(GroqModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: GroqModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncStream[chat.ChatCompletionChunk]: pass @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: GroqModelSettings, model_request_parameters: ModelRequestParameters, ) -> chat.ChatCompletion: pass async def _completions_create( self, messages: list[ModelMessage], stream: bool, model_settings: GroqModelSettings, model_request_parameters: ModelRequestParameters, ) -> chat.ChatCompletion | AsyncStream[chat.ChatCompletionChunk]: tools = self._get_tools(model_request_parameters) tools += self._get_builtin_tools(model_request_parameters) if not tools: tool_choice: Literal['none', 'required', 'auto'] | None = None elif not model_request_parameters.allow_text_output: tool_choice = 'required' else: tool_choice = 'auto' groq_messages = self._map_messages(messages) response_format: chat.completion_create_params.ResponseFormat | None = None if model_request_parameters.output_mode == 'native': output_object = model_request_parameters.output_object assert output_object is not None response_format = self._map_json_schema(output_object) elif ( model_request_parameters.output_mode == 'prompted' and not tools and self.profile.supports_json_object_output ): # pragma: no branch response_format = 
{'type': 'json_object'} try: extra_headers = model_settings.get('extra_headers', {}) extra_headers.setdefault('User-Agent', get_user_agent()) return await self.client.chat.completions.create( model=self._model_name, messages=groq_messages, n=1, parallel_tool_calls=model_settings.get('parallel_tool_calls', NOT_GIVEN), tools=tools or NOT_GIVEN, tool_choice=tool_choice or NOT_GIVEN, stop=model_settings.get('stop_sequences', NOT_GIVEN), stream=stream, response_format=response_format or NOT_GIVEN, max_tokens=model_settings.get('max_tokens', NOT_GIVEN), temperature=model_settings.get('temperature', NOT_GIVEN), top_p=model_settings.get('top_p', NOT_GIVEN), timeout=model_settings.get('timeout', NOT_GIVEN), seed=model_settings.get('seed', NOT_GIVEN), presence_penalty=model_settings.get('presence_penalty', NOT_GIVEN), reasoning_format=model_settings.get('groq_reasoning_format', NOT_GIVEN), frequency_penalty=model_settings.get('frequency_penalty', NOT_GIVEN), logit_bias=model_settings.get('logit_bias', NOT_GIVEN), extra_headers=extra_headers, extra_body=model_settings.get('extra_body'), ) except APIStatusError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover def _process_response(self, response: chat.ChatCompletion) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" timestamp = number_to_datetime(response.created) choice = response.choices[0] items: list[ModelResponsePart] = [] if choice.message.executed_tools: for tool in choice.message.executed_tools: tool_call_id = generate_tool_call_id() items.append( BuiltinToolCallPart( tool_name=tool.type, args=tool.arguments, provider_name=self.system, tool_call_id=tool_call_id ) ) items.append( BuiltinToolReturnPart( provider_name=self.system, tool_name=tool.type, content=tool.output, tool_call_id=tool_call_id ) ) if choice.message.reasoning is not None: # NOTE: The `reasoning` field is only present if `groq_reasoning_format` is set to `parsed`. items.append(ThinkingPart(content=choice.message.reasoning)) if choice.message.content is not None: # NOTE: The `<think>` tag is only present if `groq_reasoning_format` is set to `raw`.
items.extend(split_content_into_text_and_thinking(choice.message.content, self.profile.thinking_tags)) if choice.message.tool_calls is not None: for c in choice.message.tool_calls: items.append(ToolCallPart(tool_name=c.function.name, args=c.function.arguments, tool_call_id=c.id)) raw_finish_reason = choice.finish_reason provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=items, usage=_map_usage(response), model_name=response.model, timestamp=timestamp, provider_response_id=response.id, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: AsyncStream[chat.ChatCompletionChunk], model_request_parameters: ModelRequestParameters ) -> GroqStreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior( # pragma: no cover 'Streamed response ended without content or tool calls' ) return GroqStreamedResponse( model_request_parameters=model_request_parameters, _response=peekable_response, _model_name=first_chunk.model, _model_profile=self.profile, _timestamp=number_to_datetime(first_chunk.created), _provider_name=self._provider.name, ) def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[chat.ChatCompletionToolParam]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] def _get_builtin_tools( self, model_request_parameters: ModelRequestParameters ) -> list[chat.ChatCompletionToolParam]: tools: list[chat.ChatCompletionToolParam] = [] for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): if not GroqModelProfile.from_profile(self.profile).groq_always_has_web_search_builtin_tool: raise UserError('`WebSearchTool` is not supported by Groq') # pragma: no cover else: raise UserError( f'`{tool.__class__.__name__}` is not supported by `GroqModel`. If it should be, please file an issue.' 
) return tools def _map_messages(self, messages: list[ModelMessage]) -> list[chat.ChatCompletionMessageParam]: """Just maps a `pydantic_ai.Message` to a `groq.types.ChatCompletionMessageParam`.""" groq_messages: list[chat.ChatCompletionMessageParam] = [] for message in messages: if isinstance(message, ModelRequest): groq_messages.extend(self._map_user_message(message)) elif isinstance(message, ModelResponse): texts: list[str] = [] tool_calls: list[chat.ChatCompletionMessageToolCallParam] = [] for item in message.parts: if isinstance(item, TextPart): texts.append(item.content) elif isinstance(item, ToolCallPart): tool_calls.append(self._map_tool_call(item)) elif isinstance(item, ThinkingPart): start_tag, end_tag = self.profile.thinking_tags texts.append('\n'.join([start_tag, item.content, end_tag])) elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # This is currently never returned from groq pass else: assert_never(item) message_param = chat.ChatCompletionAssistantMessageParam(role='assistant') if texts: # Note: model responses from this model should only have one text item, so the following # shouldn't merge multiple texts into one unless you switch models between runs: message_param['content'] = '\n\n'.join(texts) if tool_calls: message_param['tool_calls'] = tool_calls groq_messages.append(message_param) else: assert_never(message) if instructions := self._get_instructions(messages): groq_messages.insert(0, chat.ChatCompletionSystemMessageParam(role='system', content=instructions)) return groq_messages @staticmethod def _map_tool_call(t: ToolCallPart) -> chat.ChatCompletionMessageToolCallParam: return chat.ChatCompletionMessageToolCallParam( id=_guard_tool_call_id(t=t), type='function', function={'name': t.tool_name, 'arguments': t.args_as_json_str()}, ) @staticmethod def _map_tool_definition(f: ToolDefinition) -> chat.ChatCompletionToolParam: return { 'type': 'function', 'function': { 'name': f.name, 'description': f.description or '', 'parameters': f.parameters_json_schema, }, } def _map_json_schema(self, o: OutputObjectDefinition) -> chat.completion_create_params.ResponseFormat: response_format_param: chat.completion_create_params.ResponseFormatResponseFormatJsonSchema = { 'type': 'json_schema', 'json_schema': { 'name': o.name or DEFAULT_OUTPUT_TOOL_NAME, 'schema': o.json_schema, 'strict': o.strict, }, } if o.description: # pragma: no branch response_format_param['json_schema']['description'] = o.description return response_format_param @classmethod def _map_user_message(cls, message: ModelRequest) -> Iterable[chat.ChatCompletionMessageParam]: for part in message.parts: if isinstance(part, SystemPromptPart): yield chat.ChatCompletionSystemMessageParam(role='system', content=part.content) elif isinstance(part, UserPromptPart): yield cls._map_user_prompt(part) elif isinstance(part, ToolReturnPart): yield chat.ChatCompletionToolMessageParam( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response_str(), ) elif isinstance(part, RetryPromptPart): # pragma: no branch if part.tool_name is None: yield chat.ChatCompletionUserMessageParam( # pragma: no cover role='user', content=part.model_response() ) else: yield chat.ChatCompletionToolMessageParam( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response(), ) @staticmethod def _map_user_prompt(part: UserPromptPart) -> chat.ChatCompletionUserMessageParam: content: str | list[chat.ChatCompletionContentPartParam] if isinstance(part.content, str): content = 
part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(chat.ChatCompletionContentPartTextParam(text=item, type='text')) elif isinstance(item, ImageUrl): image_url = ImageURL(url=item.url) content.append(chat.ChatCompletionContentPartImageParam(image_url=image_url, type='image_url')) elif isinstance(item, BinaryContent): base64_encoded = base64.b64encode(item.data).decode('utf-8') if item.is_image: image_url = ImageURL(url=f'data:{item.media_type};base64,{base64_encoded}') content.append(chat.ChatCompletionContentPartImageParam(image_url=image_url, type='image_url')) else: raise RuntimeError('Only images are supported for binary content in Groq.') elif isinstance(item, DocumentUrl): # pragma: no cover raise RuntimeError('DocumentUrl is not supported in Groq.') else: # pragma: no cover raise RuntimeError(f'Unsupported content type: {type(item)}') return chat.ChatCompletionUserMessageParam(role='user', content=content) ``` #### __init__ ```python __init__( model_name: GroqModelName, *, provider: ( Literal["groq"] | Provider[AsyncGroq] ) = "groq", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a Groq model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `GroqModelName` | The name of the Groq model to use. List of model names available here. | *required* | | `provider` | `Literal['groq'] | Provider[AsyncGroq]` | The provider to use for authentication and API access. Can be either the string 'groq' or an instance of Provider[AsyncGroq]. If not provided, a new provider will be created using the other parameters. | `'groq'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/groq.py` ```python def __init__( self, model_name: GroqModelName, *, provider: Literal['groq'] | Provider[AsyncGroq] = 'groq', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Groq model. Args: model_name: The name of the Groq model to use. List of model names available [here](https://console.groq.com/docs/models). provider: The provider to use for authentication and API access. Can be either the string 'groq' or an instance of `Provider[AsyncGroq]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: GroqModelName ``` The model name. #### system ```python system: str ``` The model provider. ### GroqStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Groq models. 
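You don't normally construct `GroqStreamedResponse` yourself; it is produced internally when a run streams from a Groq model. Before the implementation, here is a minimal sketch of triggering it through `Agent.run_stream()`, combined with the `groq_reasoning_format` setting documented above (the model choice and prompt are illustrative assumptions):

groq_streaming_sketch.py

```python
import asyncio

from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel, GroqModelSettings

# A reasoning-capable model from the preview list above; assumes GROQ_API_KEY is set.
agent = Agent(GroqModel('deepseek-r1-distill-llama-70b'))


async def main():
    async with agent.run_stream(
        'Briefly, why is the sky blue?',
        model_settings=GroqModelSettings(groq_reasoning_format='parsed'),
    ) as response:
        async for text in response.stream_text():
            print(text)


asyncio.run(main())
```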
Source code in `pydantic_ai_slim/pydantic_ai/models/groq.py` ```python @dataclass class GroqStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Groq models.""" _model_name: GroqModelName _model_profile: ModelProfile _response: AsyncIterable[chat.ChatCompletionChunk] _timestamp: datetime _provider_name: str async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: try: async for chunk in self._response: self._usage += _map_usage(chunk) if chunk.id: # pragma: no branch self.provider_response_id = chunk.id try: choice = chunk.choices[0] except IndexError: continue if raw_finish_reason := choice.finish_reason: self.provider_details = {'finish_reason': raw_finish_reason} self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) # Handle the text part of the response content = choice.delta.content if content is not None: maybe_event = self._parts_manager.handle_text_delta( vendor_part_id='content', content=content, thinking_tags=self._model_profile.thinking_tags, ignore_leading_whitespace=self._model_profile.ignore_streamed_leading_whitespace, ) if maybe_event is not None: # pragma: no branch yield maybe_event # Handle the tool calls for dtc in choice.delta.tool_calls or []: maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=dtc.index, tool_name=dtc.function and dtc.function.name, args=dtc.function and dtc.function.arguments, tool_call_id=dtc.id, ) if maybe_event is not None: yield maybe_event except APIError as e: if isinstance(e.body, dict): # pragma: no branch # The Groq SDK tries to be helpful by raising an exception when generated tool arguments don't match the schema, # but we'd rather handle it ourselves so we can tell the model to retry the tool call try: error = _GroqToolUseFailedInnerError.model_validate(e.body) # pyright: ignore[reportUnknownMemberType] yield self._parts_manager.handle_tool_call_part( vendor_part_id='tool_use_failed', tool_name=error.failed_generation.name, args=error.failed_generation.arguments, ) return except ValidationError as e: # pragma: no cover pass raise # pragma: no cover @property def model_name(self) -> GroqModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: GroqModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models.huggingface` ## Setup For details on how to set up authentication with this model, see [model configuration for Hugging Face](../../../models/huggingface/). ### HuggingFaceModelSettings Bases: `ModelSettings` Settings used for a Hugging Face model request. Source code in `pydantic_ai_slim/pydantic_ai/models/huggingface.py` ```python class HuggingFaceModelSettings(ModelSettings, total=False): """Settings used for a Hugging Face model request.""" ``` ### HuggingFaceModel Bases: `Model` A model that uses Hugging Face Inference Providers. Internally, this uses the [HF Python client](https://github.com/huggingface/huggingface_hub) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. 
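As a quick orientation before the source, here is a minimal sketch of constructing the model and handing it to an agent; the model name is an illustrative Hub ID, and authentication is assumed to be configured as described above:

huggingface_model_sketch.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel

# Any text-generation model served by Hugging Face Inference Providers can be used here;
# this particular name is just an example.
model = HuggingFaceModel('Qwen/Qwen3-235B-A22B')
agent = Agent(model)

result = agent.run_sync('Tell me a one-line joke.')
print(result.output)
```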
Source code in `pydantic_ai_slim/pydantic_ai/models/huggingface.py` ```python @dataclass(init=False) class HuggingFaceModel(Model): """A model that uses Hugging Face Inference Providers. Internally, this uses the [HF Python client](https://github.com/huggingface/huggingface_hub) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncInferenceClient = field(repr=False) _model_name: str = field(repr=False) _provider: Provider[AsyncInferenceClient] = field(repr=False) def __init__( self, model_name: str, *, provider: Literal['huggingface'] | Provider[AsyncInferenceClient] = 'huggingface', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Hugging Face model. Args: model_name: The name of the Model to use. You can browse available models [here](https://huggingface.co/models?pipeline_tag=text-generation&inference_provider=all&sort=trending). provider: The provider to use for Hugging Face Inference Providers. Can be either the string 'huggingface' or an instance of `Provider[AsyncInferenceClient]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def model_name(self) -> HuggingFaceModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The system / model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() response = await self._completions_create( messages, False, cast(HuggingFaceModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() response = await self._completions_create( messages, True, cast(HuggingFaceModelSettings, model_settings or {}), model_request_parameters ) yield await self._process_streamed_response(response, model_request_parameters) @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: HuggingFaceModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncIterable[ChatCompletionStreamOutput]: ... @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: HuggingFaceModelSettings, model_request_parameters: ModelRequestParameters, ) -> ChatCompletionOutput: ... 
async def _completions_create( self, messages: list[ModelMessage], stream: bool, model_settings: HuggingFaceModelSettings, model_request_parameters: ModelRequestParameters, ) -> ChatCompletionOutput | AsyncIterable[ChatCompletionStreamOutput]: tools = self._get_tools(model_request_parameters) if not tools: tool_choice: Literal['none', 'required', 'auto'] | None = None elif not model_request_parameters.allow_text_output: tool_choice = 'required' else: tool_choice = 'auto' if model_request_parameters.builtin_tools: raise UserError('HuggingFace does not support built-in tools') hf_messages = await self._map_messages(messages) try: return await self.client.chat.completions.create( # type: ignore model=self._model_name, messages=hf_messages, # type: ignore tools=tools, tool_choice=tool_choice or None, stream=stream, stop=model_settings.get('stop_sequences', None), temperature=model_settings.get('temperature', None), top_p=model_settings.get('top_p', None), seed=model_settings.get('seed', None), presence_penalty=model_settings.get('presence_penalty', None), frequency_penalty=model_settings.get('frequency_penalty', None), logit_bias=model_settings.get('logit_bias', None), # type: ignore logprobs=model_settings.get('logprobs', None), top_logprobs=model_settings.get('top_logprobs', None), extra_body=model_settings.get('extra_body'), # type: ignore ) except aiohttp.ClientResponseError as e: raise ModelHTTPError( status_code=e.status, model_name=self.model_name, body=e.response_error_payload, # type: ignore ) from e except HfHubHTTPError as e: raise ModelHTTPError( status_code=e.response.status_code, model_name=self.model_name, body=e.response.content, ) from e def _process_response(self, response: ChatCompletionOutput) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" if response.created: timestamp = datetime.fromtimestamp(response.created, tz=timezone.utc) else: timestamp = _now_utc() choice = response.choices[0] content = choice.message.content tool_calls = choice.message.tool_calls items: list[ModelResponsePart] = [] if content is not None: items.extend(split_content_into_text_and_thinking(content, self.profile.thinking_tags)) if tool_calls is not None: for c in tool_calls: items.append(ToolCallPart(c.function.name, c.function.arguments, tool_call_id=c.id)) raw_finish_reason = choice.finish_reason provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(cast(TextGenerationOutputFinishReason, raw_finish_reason), None) return ModelResponse( parts=items, usage=_map_usage(response), model_name=response.model, timestamp=timestamp, provider_response_id=response.id, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: AsyncIterable[ChatCompletionStreamOutput], model_request_parameters: ModelRequestParameters ) -> StreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior( # pragma: no cover 'Streamed response ended without content or tool calls' ) return HuggingFaceStreamedResponse( model_request_parameters=model_request_parameters, _model_name=first_chunk.model, _model_profile=self.profile, _response=peekable_response, _timestamp=datetime.fromtimestamp(first_chunk.created, tz=timezone.utc), 
_provider_name=self._provider.name, ) def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[ChatCompletionInputTool]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] async def _map_messages( self, messages: list[ModelMessage] ) -> list[ChatCompletionInputMessage | ChatCompletionOutputMessage]: """Just maps a `pydantic_ai.Message` to a `huggingface_hub.ChatCompletionInputMessage`.""" hf_messages: list[ChatCompletionInputMessage | ChatCompletionOutputMessage] = [] for message in messages: if isinstance(message, ModelRequest): async for item in self._map_user_message(message): hf_messages.append(item) elif isinstance(message, ModelResponse): texts: list[str] = [] tool_calls: list[ChatCompletionInputToolCall] = [] for item in message.parts: if isinstance(item, TextPart): texts.append(item.content) elif isinstance(item, ToolCallPart): tool_calls.append(self._map_tool_call(item)) elif isinstance(item, ThinkingPart): start_tag, end_tag = self.profile.thinking_tags texts.append('\n'.join([start_tag, item.content, end_tag])) elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # This is currently never returned from huggingface pass else: assert_never(item) message_param = ChatCompletionInputMessage(role='assistant') # type: ignore if texts: # Note: model responses from this model should only have one text item, so the following # shouldn't merge multiple texts into one unless you switch models between runs: message_param['content'] = '\n\n'.join(texts) if tool_calls: message_param['tool_calls'] = tool_calls hf_messages.append(message_param) else: assert_never(message) if instructions := self._get_instructions(messages): hf_messages.insert(0, ChatCompletionInputMessage(content=instructions, role='system')) # type: ignore return hf_messages @staticmethod def _map_tool_call(t: ToolCallPart) -> ChatCompletionInputToolCall: return ChatCompletionInputToolCall.parse_obj_as_instance( # type: ignore { 'id': _guard_tool_call_id(t=t), 'type': 'function', 'function': { 'name': t.tool_name, 'arguments': t.args_as_json_str(), }, } ) @staticmethod def _map_tool_definition(f: ToolDefinition) -> ChatCompletionInputTool: tool_param: ChatCompletionInputTool = ChatCompletionInputTool.parse_obj_as_instance( # type: ignore { 'type': 'function', 'function': { 'name': f.name, 'description': f.description, 'parameters': f.parameters_json_schema, }, } ) if f.strict is not None: tool_param['function']['strict'] = f.strict return tool_param async def _map_user_message( self, message: ModelRequest ) -> AsyncIterable[ChatCompletionInputMessage | ChatCompletionOutputMessage]: for part in message.parts: if isinstance(part, SystemPromptPart): yield ChatCompletionInputMessage.parse_obj_as_instance({'role': 'system', 'content': part.content}) # type: ignore elif isinstance(part, UserPromptPart): yield await self._map_user_prompt(part) elif isinstance(part, ToolReturnPart): yield ChatCompletionOutputMessage.parse_obj_as_instance( # type: ignore { 'role': 'tool', 'tool_call_id': _guard_tool_call_id(t=part), 'content': part.model_response_str(), } ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: yield ChatCompletionInputMessage.parse_obj_as_instance( # type: ignore {'role': 'user', 'content': part.model_response()} ) else: yield ChatCompletionInputMessage.parse_obj_as_instance( # type: ignore { 'role': 'tool', 'tool_call_id': _guard_tool_call_id(t=part), 'content': part.model_response(), } ) else: 
assert_never(part) @staticmethod async def _map_user_prompt(part: UserPromptPart) -> ChatCompletionInputMessage: content: str | list[ChatCompletionInputMessage] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(ChatCompletionInputMessageChunk(type='text', text=item)) # type: ignore elif isinstance(item, ImageUrl): url = ChatCompletionInputURL(url=item.url) # type: ignore content.append(ChatCompletionInputMessageChunk(type='image_url', image_url=url)) # type: ignore elif isinstance(item, BinaryContent): base64_encoded = base64.b64encode(item.data).decode('utf-8') if item.is_image: url = ChatCompletionInputURL(url=f'data:{item.media_type};base64,{base64_encoded}') # type: ignore content.append(ChatCompletionInputMessageChunk(type='image_url', image_url=url)) # type: ignore else: # pragma: no cover raise RuntimeError(f'Unsupported binary content type: {item.media_type}') elif isinstance(item, AudioUrl): raise NotImplementedError('AudioUrl is not supported for Hugging Face') elif isinstance(item, DocumentUrl): raise NotImplementedError('DocumentUrl is not supported for Hugging Face') elif isinstance(item, VideoUrl): raise NotImplementedError('VideoUrl is not supported for Hugging Face') else: assert_never(item) return ChatCompletionInputMessage(role='user', content=content) # type: ignore ``` #### __init__ ```python __init__( model_name: str, *, provider: ( Literal["huggingface"] | Provider[AsyncInferenceClient] ) = "huggingface", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a Hugging Face model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `str` | The name of the Model to use. You can browse available models here. | *required* | | `provider` | `Literal['huggingface'] | Provider[AsyncInferenceClient]` | The provider to use for Hugging Face Inference Providers. Can be either the string 'huggingface' or an instance of Provider[AsyncInferenceClient]. If not provided, the other parameters will be used. | `'huggingface'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/huggingface.py` ```python def __init__( self, model_name: str, *, provider: Literal['huggingface'] | Provider[AsyncInferenceClient] = 'huggingface', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Hugging Face model. Args: model_name: The name of the Model to use. You can browse available models [here](https://huggingface.co/models?pipeline_tag=text-generation&inference_provider=all&sort=trending). provider: The provider to use for Hugging Face Inference Providers. Can be either the string 'huggingface' or an instance of `Provider[AsyncInferenceClient]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: HuggingFaceModelName ``` The model name. #### system ```python system: str ``` The system / model provider. # pydantic_ai.models.instrumented ### instrument_model ```python instrument_model( model: Model, instrument: InstrumentationSettings | bool ) -> Model ``` Instrument a model with OpenTelemetry/logfire. Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def instrument_model(model: Model, instrument: InstrumentationSettings | bool) -> Model: """Instrument a model with OpenTelemetry/logfire.""" if instrument and not isinstance(model, InstrumentedModel): if instrument is True: instrument = InstrumentationSettings() model = InstrumentedModel(model, instrument) return model ``` ### InstrumentationSettings Options for instrumenting models and agents with OpenTelemetry. Used in: - `Agent(instrument=...)` - Agent.instrument_all() - InstrumentedModel See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python @dataclass(init=False) class InstrumentationSettings: """Options for instrumenting models and agents with OpenTelemetry. Used in: - `Agent(instrument=...)` - [`Agent.instrument_all()`][pydantic_ai.agent.Agent.instrument_all] - [`InstrumentedModel`][pydantic_ai.models.instrumented.InstrumentedModel] See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. """ tracer: Tracer = field(repr=False) event_logger: EventLogger = field(repr=False) event_mode: Literal['attributes', 'logs'] = 'attributes' include_binary_content: bool = True include_content: bool = True version: Literal[1, 2] = 1 def __init__( self, *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2] = 2, event_mode: Literal['attributes', 'logs'] = 'attributes', event_logger_provider: EventLoggerProvider | None = None, ): """Create instrumentation options. Args: tracer_provider: The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. meter_provider: The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. include_binary_content: Whether to include binary content in the instrumentation events. include_content: Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. version: Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `event_logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. - `pydantic_ai.all_messages` on agent run spans. 
event_mode: The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. event_logger_provider: The OpenTelemetry event logger provider to use. If not provided, the global event logger provider is used. Calling `logfire.configure()` sets the global event logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. """ from pydantic_ai import __version__ tracer_provider = tracer_provider or get_tracer_provider() meter_provider = meter_provider or get_meter_provider() event_logger_provider = event_logger_provider or get_event_logger_provider() scope_name = 'pydantic-ai' self.tracer = tracer_provider.get_tracer(scope_name, __version__) self.meter = meter_provider.get_meter(scope_name, __version__) self.event_logger = event_logger_provider.get_event_logger(scope_name, __version__) self.event_mode = event_mode self.include_binary_content = include_binary_content self.include_content = include_content if event_mode == 'logs' and version != 1: warnings.warn( 'event_mode is only relevant for version=1 which is deprecated and will be removed in a future release.', stacklevel=2, ) version = 1 self.version = version # As specified in the OpenTelemetry GenAI metrics spec: # https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage tokens_histogram_kwargs = dict( name='gen_ai.client.token.usage', unit='{token}', description='Measures number of input and output tokens used', ) try: self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, explicit_bucket_boundaries_advisory=TOKEN_HISTOGRAM_BOUNDARIES, ) except TypeError: # pragma: lax no cover # Older OTel/logfire versions don't support explicit_bucket_boundaries_advisory self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, # pyright: ignore ) def messages_to_otel_events(self, messages: list[ModelMessage]) -> list[Event]: """Convert a list of model messages to OpenTelemetry events. Args: messages: The messages to convert. Returns: A list of OpenTelemetry events. 
""" events: list[Event] = [] instructions = InstrumentedModel._get_instructions(messages) # pyright: ignore [reportPrivateUsage] if instructions is not None: events.append( Event( 'gen_ai.system.message', body={**({'content': instructions} if self.include_content else {}), 'role': 'system'}, ) ) for message_index, message in enumerate(messages): message_events: list[Event] = [] if isinstance(message, ModelRequest): for part in message.parts: if hasattr(part, 'otel_event'): message_events.append(part.otel_event(self)) elif isinstance(message, ModelResponse): # pragma: no branch message_events = message.otel_events(self) for event in message_events: event.attributes = { 'gen_ai.message.index': message_index, **(event.attributes or {}), } events.extend(message_events) for event in events: event.body = InstrumentedModel.serialize_any(event.body) return events def messages_to_otel_messages(self, messages: list[ModelMessage]) -> list[_otel_messages.ChatMessage]: result: list[_otel_messages.ChatMessage] = [] for message in messages: if isinstance(message, ModelRequest): for is_system, group in itertools.groupby(message.parts, key=lambda p: isinstance(p, SystemPromptPart)): message_parts: list[_otel_messages.MessagePart] = [] for part in group: if hasattr(part, 'otel_message_parts'): message_parts.extend(part.otel_message_parts(self)) result.append( _otel_messages.ChatMessage(role='system' if is_system else 'user', parts=message_parts) ) elif isinstance(message, ModelResponse): # pragma: no branch otel_message = _otel_messages.OutputMessage(role='assistant', parts=message.otel_message_parts(self)) if message.finish_reason is not None: otel_message['finish_reason'] = message.finish_reason result.append(otel_message) return result def handle_messages(self, input_messages: list[ModelMessage], response: ModelResponse, system: str, span: Span): if self.version == 1: events = self.messages_to_otel_events(input_messages) for event in self.messages_to_otel_events([response]): events.append( Event( 'gen_ai.choice', body={ 'index': 0, 'message': event.body, }, ) ) for event in events: event.attributes = { GEN_AI_SYSTEM_ATTRIBUTE: system, **(event.attributes or {}), } self._emit_events(span, events) else: output_messages = self.messages_to_otel_messages([response]) assert len(output_messages) == 1 output_message = output_messages[0] instructions = InstrumentedModel._get_instructions(input_messages) # pyright: ignore [reportPrivateUsage] system_instructions_attributes = self.system_instructions_attributes(instructions) attributes: dict[str, AttributeValue] = { 'gen_ai.input.messages': json.dumps(self.messages_to_otel_messages(input_messages)), 'gen_ai.output.messages': json.dumps([output_message]), **system_instructions_attributes, 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { 'gen_ai.input.messages': {'type': 'array'}, 'gen_ai.output.messages': {'type': 'array'}, **( {'gen_ai.system_instructions': {'type': 'array'}} if system_instructions_attributes else {} ), 'model_request_parameters': {'type': 'object'}, }, } ), } span.set_attributes(attributes) def system_instructions_attributes(self, instructions: str | None) -> dict[str, str]: if instructions and self.include_content: return { 'gen_ai.system_instructions': json.dumps([_otel_messages.TextPart(type='text', content=instructions)]), } return {} def _emit_events(self, span: Span, events: list[Event]) -> None: if self.event_mode == 'logs': for event in events: self.event_logger.emit(event) else: attr_name = 'events' 
span.set_attributes( { attr_name: json.dumps([InstrumentedModel.event_to_dict(event) for event in events]), 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { attr_name: {'type': 'array'}, 'model_request_parameters': {'type': 'object'}, }, } ), } ) ``` #### __init__ ```python __init__( *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2] = 2, event_mode: Literal[ "attributes", "logs" ] = "attributes", event_logger_provider: EventLoggerProvider | None = None ) ``` Create instrumentation options. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `tracer_provider` | `TracerProvider | None` | The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling logfire.configure() sets the global tracer provider, so most users don't need this. | `None` | | `meter_provider` | `MeterProvider | None` | The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling logfire.configure() sets the global meter provider, so most users don't need this. | `None` | | `include_binary_content` | `bool` | Whether to include binary content in the instrumentation events. | `True` | | `include_content` | `bool` | Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. | `True` | | `version` | `Literal[1, 2]` | Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters event_mode and event_logger_provider are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - gen_ai.system_instructions for instructions passed to the agent. - gen_ai.input.messages and gen_ai.output.messages on model request spans. - pydantic_ai.all_messages on agent run spans. | `2` | | `event_mode` | `Literal['attributes', 'logs']` | The mode for emitting events in version 1. If 'attributes', events are attached to the span as attributes. If 'logs', events are emitted as OpenTelemetry log-based events. | `'attributes'` | | `event_logger_provider` | `EventLoggerProvider | None` | The OpenTelemetry event logger provider to use. If not provided, the global event logger provider is used. Calling logfire.configure() sets the global event logger provider, so most users don't need this. This is only used if event_mode='logs' and version=1. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def __init__( self, *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2] = 2, event_mode: Literal['attributes', 'logs'] = 'attributes', event_logger_provider: EventLoggerProvider | None = None, ): """Create instrumentation options. Args: tracer_provider: The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. meter_provider: The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. 
include_binary_content: Whether to include binary content in the instrumentation events. include_content: Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. version: Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `event_logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. - `pydantic_ai.all_messages` on agent run spans. event_mode: The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. event_logger_provider: The OpenTelemetry event logger provider to use. If not provided, the global event logger provider is used. Calling `logfire.configure()` sets the global event logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. """ from pydantic_ai import __version__ tracer_provider = tracer_provider or get_tracer_provider() meter_provider = meter_provider or get_meter_provider() event_logger_provider = event_logger_provider or get_event_logger_provider() scope_name = 'pydantic-ai' self.tracer = tracer_provider.get_tracer(scope_name, __version__) self.meter = meter_provider.get_meter(scope_name, __version__) self.event_logger = event_logger_provider.get_event_logger(scope_name, __version__) self.event_mode = event_mode self.include_binary_content = include_binary_content self.include_content = include_content if event_mode == 'logs' and version != 1: warnings.warn( 'event_mode is only relevant for version=1 which is deprecated and will be removed in a future release.', stacklevel=2, ) version = 1 self.version = version # As specified in the OpenTelemetry GenAI metrics spec: # https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage tokens_histogram_kwargs = dict( name='gen_ai.client.token.usage', unit='{token}', description='Measures number of input and output tokens used', ) try: self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, explicit_bucket_boundaries_advisory=TOKEN_HISTOGRAM_BOUNDARIES, ) except TypeError: # pragma: lax no cover # Older OTel/logfire versions don't support explicit_bucket_boundaries_advisory self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, # pyright: ignore ) ``` #### messages_to_otel_events ```python messages_to_otel_events( messages: list[ModelMessage], ) -> list[Event] ``` Convert a list of model messages to OpenTelemetry events. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `messages` | `list[ModelMessage]` | The messages to convert. | *required* | Returns: | Type | Description | | --- | --- | | `list[Event]` | A list of OpenTelemetry events. | Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def messages_to_otel_events(self, messages: list[ModelMessage]) -> list[Event]: """Convert a list of model messages to OpenTelemetry events. Args: messages: The messages to convert. Returns: A list of OpenTelemetry events. 
""" events: list[Event] = [] instructions = InstrumentedModel._get_instructions(messages) # pyright: ignore [reportPrivateUsage] if instructions is not None: events.append( Event( 'gen_ai.system.message', body={**({'content': instructions} if self.include_content else {}), 'role': 'system'}, ) ) for message_index, message in enumerate(messages): message_events: list[Event] = [] if isinstance(message, ModelRequest): for part in message.parts: if hasattr(part, 'otel_event'): message_events.append(part.otel_event(self)) elif isinstance(message, ModelResponse): # pragma: no branch message_events = message.otel_events(self) for event in message_events: event.attributes = { 'gen_ai.message.index': message_index, **(event.attributes or {}), } events.extend(message_events) for event in events: event.body = InstrumentedModel.serialize_any(event.body) return events ``` ### InstrumentedModel Bases: `WrapperModel` Model which wraps another model so that requests are instrumented with OpenTelemetry. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python @dataclass(init=False) class InstrumentedModel(WrapperModel): """Model which wraps another model so that requests are instrumented with OpenTelemetry. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. """ instrumentation_settings: InstrumentationSettings """Instrumentation settings for this model.""" def __init__( self, wrapped: Model | KnownModelName, options: InstrumentationSettings | None = None, ) -> None: super().__init__(wrapped) self.instrumentation_settings = options or InstrumentationSettings() async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: with self._instrument(messages, model_settings, model_request_parameters) as finish: response = await super().request(messages, model_settings, model_request_parameters) finish(response) return response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: with self._instrument(messages, model_settings, model_request_parameters) as finish: response_stream: StreamedResponse | None = None try: async with super().request_stream( messages, model_settings, model_request_parameters, run_context ) as response_stream: yield response_stream finally: if response_stream: # pragma: no branch finish(response_stream.get()) @contextmanager def _instrument( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> Iterator[Callable[[ModelResponse], None]]: operation = 'chat' span_name = f'{operation} {self.model_name}' # TODO Missing attributes: # - error.type: unclear if we should do something here or just always rely on span exceptions # - gen_ai.request.stop_sequences/top_k: model_settings doesn't include these attributes: dict[str, AttributeValue] = { 'gen_ai.operation.name': operation, **self.model_attributes(self.wrapped), 'model_request_parameters': json.dumps(InstrumentedModel.serialize_any(model_request_parameters)), 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': {'model_request_parameters': {'type': 'object'}}, } ), } if model_settings: for key in 
MODEL_SETTING_ATTRIBUTES: if isinstance(value := model_settings.get(key), float | int): attributes[f'gen_ai.request.{key}'] = value record_metrics: Callable[[], None] | None = None try: with self.instrumentation_settings.tracer.start_as_current_span(span_name, attributes=attributes) as span: def finish(response: ModelResponse): # FallbackModel updates these span attributes. attributes.update(getattr(span, 'attributes', {})) request_model = attributes[GEN_AI_REQUEST_MODEL_ATTRIBUTE] system = cast(str, attributes[GEN_AI_SYSTEM_ATTRIBUTE]) response_model = response.model_name or request_model def _record_metrics(): metric_attributes = { GEN_AI_SYSTEM_ATTRIBUTE: system, 'gen_ai.operation.name': operation, 'gen_ai.request.model': request_model, 'gen_ai.response.model': response_model, } if response.usage.input_tokens: # pragma: no branch self.instrumentation_settings.tokens_histogram.record( response.usage.input_tokens, {**metric_attributes, 'gen_ai.token.type': 'input'}, ) if response.usage.output_tokens: # pragma: no branch self.instrumentation_settings.tokens_histogram.record( response.usage.output_tokens, {**metric_attributes, 'gen_ai.token.type': 'output'}, ) nonlocal record_metrics record_metrics = _record_metrics if not span.is_recording(): return self.instrumentation_settings.handle_messages(messages, response, system, span) attributes_to_set = { **response.usage.opentelemetry_attributes(), 'gen_ai.response.model': response_model, } try: attributes_to_set['operation.cost'] = float(response.cost().total_price) except LookupError: # The cost of this provider/model is unknown, which is common. pass except Exception as e: warnings.warn( f'Failed to get cost from response: {type(e).__name__}: {e}', CostCalculationFailedWarning ) if response.provider_response_id is not None: attributes_to_set['gen_ai.response.id'] = response.provider_response_id if response.finish_reason is not None: attributes_to_set['gen_ai.response.finish_reasons'] = [response.finish_reason] span.set_attributes(attributes_to_set) span.update_name(f'{operation} {request_model}') yield finish finally: if record_metrics: # We only want to record metrics after the span is finished, # to prevent them from being redundantly recorded in the span itself by logfire. record_metrics() @staticmethod def model_attributes(model: Model): attributes: dict[str, AttributeValue] = { GEN_AI_SYSTEM_ATTRIBUTE: model.system, GEN_AI_REQUEST_MODEL_ATTRIBUTE: model.model_name, } if base_url := model.base_url: try: parsed = urlparse(base_url) except Exception: # pragma: no cover pass else: if parsed.hostname: # pragma: no branch attributes['server.address'] = parsed.hostname if parsed.port: # pragma: no branch attributes['server.port'] = parsed.port return attributes @staticmethod def event_to_dict(event: Event) -> dict[str, Any]: if not event.body: body = {} # pragma: no cover elif isinstance(event.body, Mapping): body = event.body # type: ignore else: body = {'body': event.body} return {**body, **(event.attributes or {})} @staticmethod def serialize_any(value: Any) -> str: try: return ANY_ADAPTER.dump_python(value, mode='json') except Exception: try: return str(value) except Exception as e: return f'Unable to serialize: {e}' ``` #### instrumentation_settings ```python instrumentation_settings: InstrumentationSettings = ( options or InstrumentationSettings() ) ``` Instrumentation settings for this model. # pydantic_ai.models.mcp_sampling ### MCPSamplingModelSettings Bases: `ModelSettings` Settings used for an MCP Sampling model request. 
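As a rough sketch of how these settings can be supplied, the snippet below builds an `MCPSamplingModelSettings` dict combining generic `ModelSettings` fields with MCP model preferences. The `ModelPreferences` import and the specific priority values are illustrative assumptions based on the `mcp` SDK types.

```python
from mcp.types import ModelPreferences

from pydantic_ai.models.mcp_sampling import MCPSamplingModelSettings

# Illustrative preferences: ask the MCP client to prefer a fast, cheap model
# (field names follow the MCP specification; assumed here).
settings = MCPSamplingModelSettings(
    max_tokens=1024,  # plain ModelSettings field
    temperature=0.2,  # plain ModelSettings field
    mcp_model_preferences=ModelPreferences(speedPriority=0.8, costPriority=0.6),
)
```

The resulting dict can be passed as `model_settings` when running an agent backed by `MCPSamplingModel`; if `max_tokens` is omitted, the model's `default_max_tokens` is used instead.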
Source code in `pydantic_ai_slim/pydantic_ai/models/mcp_sampling.py` ```python class MCPSamplingModelSettings(ModelSettings, total=False): """Settings used for an MCP Sampling model request.""" # ALL FIELDS MUST BE `mcp_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. mcp_model_preferences: ModelPreferences """Model preferences to use for MCP Sampling.""" ``` #### mcp_model_preferences ```python mcp_model_preferences: ModelPreferences ``` Model preferences to use for MCP Sampling. ### MCPSamplingModel Bases: `Model` A model that uses MCP Sampling. [MCP Sampling](https://modelcontextprotocol.io/docs/concepts/sampling) allows an MCP server to make requests to a model by calling back to the MCP client that connected to it. Source code in `pydantic_ai_slim/pydantic_ai/models/mcp_sampling.py` ```python @dataclass class MCPSamplingModel(Model): """A model that uses MCP Sampling. [MCP Sampling](https://modelcontextprotocol.io/docs/concepts/sampling) allows an MCP server to make requests to a model by calling back to the MCP client that connected to it. """ session: ServerSession """The MCP server session to use for sampling.""" _: KW_ONLY default_max_tokens: int = 16_384 """Default max tokens to use if not set in [`ModelSettings`][pydantic_ai.settings.ModelSettings.max_tokens]. Max tokens is a required parameter for MCP Sampling, but optional on [`ModelSettings`][pydantic_ai.settings.ModelSettings], so this value is used as fallback. """ async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: system_prompt, sampling_messages = _mcp.map_from_pai_messages(messages) model_settings = cast(MCPSamplingModelSettings, model_settings or {}) result = await self.session.create_message( sampling_messages, max_tokens=model_settings.get('max_tokens', self.default_max_tokens), system_prompt=system_prompt, temperature=model_settings.get('temperature'), model_preferences=model_settings.get('mcp_model_preferences'), stop_sequences=model_settings.get('stop_sequences'), ) if result.role == 'assistant': return ModelResponse( parts=[_mcp.map_from_sampling_content(result.content)], model_name=result.model, ) else: raise exceptions.UnexpectedModelBehavior( f'Unexpected result from MCP sampling, expected "assistant" role, got {result.role}.' ) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: raise NotImplementedError('MCP Sampling does not support streaming') yield @property def model_name(self) -> str: """The model name. Since the model name isn't known until the request is made, this property always returns `'mcp-sampling'`. """ return 'mcp-sampling' @property def system(self) -> str: """The system / model provider, returns `'MCP'`.""" return 'MCP' ``` #### session ```python session: ServerSession ``` The MCP server session to use for sampling. #### default_max_tokens ```python default_max_tokens: int = 16384 ``` Default max tokens to use if not set in ModelSettings. Max tokens is a required parameter for MCP Sampling, but optional on ModelSettings, so this value is used as fallback. #### model_name ```python model_name: str ``` The model name. Since the model name isn't known until the request is made, this property always returns `'mcp-sampling'`. 
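Since a `ServerSession` is only available inside a running MCP server, this model is typically constructed inside a tool handler. A minimal sketch, assuming the `FastMCP` server API from the `mcp` SDK:

```python
from mcp.server.fastmcp import Context, FastMCP

from pydantic_ai import Agent
from pydantic_ai.models.mcp_sampling import MCPSamplingModel

server = FastMCP('Sampling example')
poem_agent = Agent(system_prompt='Always reply in rhyme.')


@server.tool()
async def poet(ctx: Context, theme: str) -> str:
    """Generate a short poem; the connected MCP client runs the actual LLM request."""
    result = await poem_agent.run(
        f'Write a poem about {theme}.',
        model=MCPSamplingModel(session=ctx.session),
    )
    return result.output
```

The sampling request is routed back to whichever MCP client connected to `server`, as described above.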
#### system

```python
system: str
```

The system / model provider, returns `'MCP'`.

# `pydantic_ai.models.mistral`

## Setup

For details on how to set up authentication with this model, see [model configuration for Mistral](../../../models/mistral/).

### LatestMistralModelNames

```python
LatestMistralModelNames = Literal[
    "mistral-large-latest",
    "mistral-small-latest",
    "codestral-latest",
    "mistral-moderation-latest",
]
```

Latest Mistral models.

### MistralModelName

```python
MistralModelName = str | LatestMistralModelNames
```

Possible Mistral model names.

Since Mistral supports a variety of date-stamped models, we explicitly list the most popular models but allow any name in the type hints. See [the Mistral docs](https://docs.mistral.ai/getting-started/models/models_overview/) for a full list.

### MistralModelSettings

Bases: `ModelSettings`

Settings used for a Mistral model request.

Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py`

```python
class MistralModelSettings(ModelSettings, total=False):
    """Settings used for a Mistral model request."""
```

### MistralModel

Bases: `Model`

A model that uses Mistral.

Internally, this uses the [Mistral Python client](https://github.com/mistralai/client-python) to interact with the API.

[API Documentation](https://docs.mistral.ai/)

Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py`

````python
@dataclass(init=False)
class MistralModel(Model):
    """A model that uses Mistral.

    Internally, this uses the [Mistral Python client](https://github.com/mistralai/client-python) to interact with the API.

    [API Documentation](https://docs.mistral.ai/)
    """

    client: Mistral = field(repr=False)
    json_mode_schema_prompt: str

    _model_name: MistralModelName = field(repr=False)
    _provider: Provider[Mistral] = field(repr=False)

    def __init__(
        self,
        model_name: MistralModelName,
        *,
        provider: Literal['mistral'] | Provider[Mistral] = 'mistral',
        profile: ModelProfileSpec | None = None,
        json_mode_schema_prompt: str = """Answer in JSON Object, respect the format:\n```\n{schema}\n```\n""",
        settings: ModelSettings | None = None,
    ):
        """Initialize a Mistral model.

        Args:
            model_name: The name of the model to use.
            provider: The provider to use for authentication and API access. Can be either the string
                'mistral' or an instance of `Provider[Mistral]`. If not provided, a new provider will be
                created using the other parameters.
            profile: The model profile to use. Defaults to a profile picked by the provider based on the model name.
            json_mode_schema_prompt: The prompt to show when the model expects a JSON object as input.
            settings: Model-specific settings that will be used as defaults for this model.
""" self._model_name = model_name self.json_mode_schema_prompt = json_mode_schema_prompt if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return self._provider.base_url @property def model_name(self) -> MistralModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a non-streaming request to the model from Pydantic AI call.""" check_allow_model_requests() response = await self._completions_create( messages, cast(MistralModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming request to the model from Pydantic AI call.""" check_allow_model_requests() response = await self._stream_completions_create( messages, cast(MistralModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) async def _completions_create( self, messages: list[ModelMessage], model_settings: MistralModelSettings, model_request_parameters: ModelRequestParameters, ) -> MistralChatCompletionResponse: """Make a non-streaming request to the model.""" # TODO(Marcelo): We need to replace the current MistralAI client to use the beta client. # See https://docs.mistral.ai/agents/connectors/websearch/ to support web search. if model_request_parameters.builtin_tools: raise UserError('Mistral does not support built-in tools') try: response = await self.client.chat.complete_async( model=str(self._model_name), messages=self._map_messages(messages), n=1, tools=self._map_function_and_output_tools_definition(model_request_parameters) or UNSET, tool_choice=self._get_tool_choice(model_request_parameters), stream=False, max_tokens=model_settings.get('max_tokens', UNSET), temperature=model_settings.get('temperature', UNSET), top_p=model_settings.get('top_p', 1), timeout_ms=self._get_timeout_ms(model_settings.get('timeout')), random_seed=model_settings.get('seed', UNSET), stop=model_settings.get('stop_sequences', None), http_headers={'User-Agent': get_user_agent()}, ) except SDKError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover assert response, 'A unexpected empty response from Mistral.' return response async def _stream_completions_create( self, messages: list[ModelMessage], model_settings: MistralModelSettings, model_request_parameters: ModelRequestParameters, ) -> MistralEventStreamAsync[MistralCompletionEvent]: """Create a streaming completion request to the Mistral model.""" response: MistralEventStreamAsync[MistralCompletionEvent] | None mistral_messages = self._map_messages(messages) # TODO(Marcelo): We need to replace the current MistralAI client to use the beta client. 
# See https://docs.mistral.ai/agents/connectors/websearch/ to support web search. if model_request_parameters.builtin_tools: raise UserError('Mistral does not support built-in tools') if model_request_parameters.function_tools: # Function Calling response = await self.client.chat.stream_async( model=str(self._model_name), messages=mistral_messages, n=1, tools=self._map_function_and_output_tools_definition(model_request_parameters) or UNSET, tool_choice=self._get_tool_choice(model_request_parameters), temperature=model_settings.get('temperature', UNSET), top_p=model_settings.get('top_p', 1), max_tokens=model_settings.get('max_tokens', UNSET), timeout_ms=self._get_timeout_ms(model_settings.get('timeout')), presence_penalty=model_settings.get('presence_penalty'), frequency_penalty=model_settings.get('frequency_penalty'), stop=model_settings.get('stop_sequences', None), http_headers={'User-Agent': get_user_agent()}, ) elif model_request_parameters.output_tools: # TODO: Port to native "manual JSON" mode # Json Mode parameters_json_schemas = [tool.parameters_json_schema for tool in model_request_parameters.output_tools] user_output_format_message = self._generate_user_output_format(parameters_json_schemas) mistral_messages.append(user_output_format_message) response = await self.client.chat.stream_async( model=str(self._model_name), messages=mistral_messages, response_format={ 'type': 'json_object' }, # TODO: Should be able to use json_schema now: https://docs.mistral.ai/capabilities/structured-output/custom_structured_output/, https://github.com/mistralai/client-python/blob/bc4adf335968c8a272e1ab7da8461c9943d8e701/src/mistralai/extra/utils/response_format.py#L9 stream=True, http_headers={'User-Agent': get_user_agent()}, ) else: # Stream Mode response = await self.client.chat.stream_async( model=str(self._model_name), messages=mistral_messages, stream=True, http_headers={'User-Agent': get_user_agent()}, ) assert response, 'A unexpected empty response from Mistral.' return response def _get_tool_choice(self, model_request_parameters: ModelRequestParameters) -> MistralToolChoiceEnum | None: """Get tool choice for the model. - "auto": Default mode. Model decides if it uses the tool or not. - "any": Select any tool. - "none": Prevents tool use. - "required": Forces tool use. """ if not model_request_parameters.function_tools and not model_request_parameters.output_tools: return None elif not model_request_parameters.allow_text_output: return 'required' else: return 'auto' def _map_function_and_output_tools_definition( self, model_request_parameters: ModelRequestParameters ) -> list[MistralTool] | None: """Map function and output tools to MistralTool format. Returns None if both function_tools and output_tools are empty. """ tools = [ MistralTool( function=MistralFunction( name=r.name, parameters=r.parameters_json_schema, description=r.description or '' ) ) for r in model_request_parameters.tool_defs.values() ] return tools if tools else None def _process_response(self, response: MistralChatCompletionResponse) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" assert response.choices, 'Unexpected empty response choice.' 
if response.created: timestamp = number_to_datetime(response.created) else: timestamp = _now_utc() choice = response.choices[0] content = choice.message.content tool_calls = choice.message.tool_calls parts: list[ModelResponsePart] = [] text, thinking = _map_content(content) for thought in thinking: parts.append(ThinkingPart(content=thought)) if text: parts.append(TextPart(content=text)) if isinstance(tool_calls, list): for tool_call in tool_calls: tool = self._map_mistral_to_pydantic_tool_call(tool_call=tool_call) parts.append(tool) raw_finish_reason = choice.finish_reason provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=parts, usage=_map_usage(response), model_name=response.model, timestamp=timestamp, provider_response_id=response.id, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: MistralEventStreamAsync[MistralCompletionEvent], model_request_parameters: ModelRequestParameters, ) -> StreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior( # pragma: no cover 'Streamed response ended without content or tool calls' ) if first_chunk.data.created: timestamp = number_to_datetime(first_chunk.data.created) else: timestamp = _now_utc() return MistralStreamedResponse( model_request_parameters=model_request_parameters, _response=peekable_response, _model_name=first_chunk.data.model, _timestamp=timestamp, _provider_name=self._provider.name, ) @staticmethod def _map_mistral_to_pydantic_tool_call(tool_call: MistralToolCall) -> ToolCallPart: """Maps a MistralToolCall to a ToolCall.""" tool_call_id = tool_call.id or _generate_tool_call_id() func_call = tool_call.function return ToolCallPart(func_call.name, func_call.arguments, tool_call_id) @staticmethod def _map_tool_call(t: ToolCallPart) -> MistralToolCall: """Maps a pydantic-ai ToolCall to a MistralToolCall.""" return MistralToolCall( id=_utils.guard_tool_call_id(t=t), type='function', function=MistralFunctionCall(name=t.tool_name, arguments=t.args or {}), ) def _generate_user_output_format(self, schemas: list[dict[str, Any]]) -> MistralUserMessage: """Get a message with an example of the expected output format.""" examples: list[dict[str, Any]] = [] for schema in schemas: typed_dict_definition: dict[str, Any] = {} for key, value in schema.get('properties', {}).items(): typed_dict_definition[key] = self._get_python_type(value) examples.append(typed_dict_definition) example_schema = examples[0] if len(examples) == 1 else examples return MistralUserMessage(content=self.json_mode_schema_prompt.format(schema=example_schema)) @classmethod def _get_python_type(cls, value: dict[str, Any]) -> str: """Return a string representation of the Python type for a single JSON schema property. This function handles recursion for nested arrays/objects and `anyOf`. 
""" # 1) Handle anyOf first, because it's a different schema structure if any_of := value.get('anyOf'): # Simplistic approach: pick the first option in anyOf # (In reality, you'd possibly want to merge or union types) return f'Optional[{cls._get_python_type(any_of[0])}]' # 2) If we have a top-level "type" field value_type = value.get('type') if not value_type: # No explicit type; fallback return 'Any' # 3) Direct simple type mapping (string, integer, float, bool, None) if value_type in SIMPLE_JSON_TYPE_MAPPING and value_type != 'array' and value_type != 'object': return SIMPLE_JSON_TYPE_MAPPING[value_type] # 4) Array: Recursively get the item type if value_type == 'array': items = value.get('items', {}) return f'list[{cls._get_python_type(items)}]' # 5) Object: Check for additionalProperties if value_type == 'object': additional_properties = value.get('additionalProperties', {}) if isinstance(additional_properties, bool): return 'bool' # pragma: lax no cover additional_properties_type = additional_properties.get('type') if ( additional_properties_type in SIMPLE_JSON_TYPE_MAPPING and additional_properties_type != 'array' and additional_properties_type != 'object' ): # dict[str, bool/int/float/etc...] return f'dict[str, {SIMPLE_JSON_TYPE_MAPPING[additional_properties_type]}]' elif additional_properties_type == 'array': array_items = additional_properties.get('items', {}) return f'dict[str, list[{cls._get_python_type(array_items)}]]' elif additional_properties_type == 'object': # nested dictionary of unknown shape return 'dict[str, dict[str, Any]]' else: # If no additionalProperties type or something else, default to a generic dict return 'dict[str, Any]' # 6) Fallback return 'Any' @staticmethod def _get_timeout_ms(timeout: Timeout | float | None) -> int | None: """Convert a timeout to milliseconds.""" if timeout is None: return None if isinstance(timeout, float): # pragma: no cover return int(1000 * timeout) raise NotImplementedError('Timeout object is not yet supported for MistralModel.') def _map_user_message(self, message: ModelRequest) -> Iterable[MistralMessages]: for part in message.parts: if isinstance(part, SystemPromptPart): yield MistralSystemMessage(content=part.content) elif isinstance(part, UserPromptPart): yield self._map_user_prompt(part) elif isinstance(part, ToolReturnPart): yield MistralToolMessage( tool_call_id=part.tool_call_id, content=part.model_response_str(), ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: yield MistralUserMessage(content=part.model_response()) # pragma: no cover else: yield MistralToolMessage( tool_call_id=part.tool_call_id, content=part.model_response(), ) else: assert_never(part) def _map_messages(self, messages: list[ModelMessage]) -> list[MistralMessages]: """Just maps a `pydantic_ai.Message` to a `MistralMessage`.""" mistral_messages: list[MistralMessages] = [] for message in messages: if isinstance(message, ModelRequest): mistral_messages.extend(self._map_user_message(message)) elif isinstance(message, ModelResponse): content_chunks: list[MistralContentChunk] = [] thinking_chunks: list[MistralTextChunk | MistralReferenceChunk] = [] tool_calls: list[MistralToolCall] = [] for part in message.parts: if isinstance(part, TextPart): content_chunks.append(MistralTextChunk(text=part.content)) elif isinstance(part, ThinkingPart): thinking_chunks.append(MistralTextChunk(text=part.content)) elif isinstance(part, ToolCallPart): tool_calls.append(self._map_tool_call(part)) elif isinstance(part, BuiltinToolCallPart | 
BuiltinToolReturnPart): # pragma: no cover # This is currently never returned from mistral pass else: assert_never(part) if thinking_chunks: content_chunks.insert(0, MistralThinkChunk(thinking=thinking_chunks)) mistral_messages.append(MistralAssistantMessage(content=content_chunks, tool_calls=tool_calls)) else: assert_never(message) if instructions := self._get_instructions(messages): mistral_messages.insert(0, MistralSystemMessage(content=instructions)) # Post-process messages to insert fake assistant message after tool message if followed by user message # to work around `Unexpected role 'user' after role 'tool'` error. processed_messages: list[MistralMessages] = [] for i, current_message in enumerate(mistral_messages): processed_messages.append(current_message) if isinstance(current_message, MistralToolMessage) and i + 1 < len(mistral_messages): next_message = mistral_messages[i + 1] if isinstance(next_message, MistralUserMessage): # Insert a dummy assistant message processed_messages.append(MistralAssistantMessage(content=[MistralTextChunk(text='OK')])) return processed_messages def _map_user_prompt(self, part: UserPromptPart) -> MistralUserMessage: content: str | list[MistralContentChunk] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(MistralTextChunk(text=item)) elif isinstance(item, ImageUrl): content.append(MistralImageURLChunk(image_url=MistralImageURL(url=item.url))) elif isinstance(item, BinaryContent): base64_encoded = base64.b64encode(item.data).decode('utf-8') if item.is_image: image_url = MistralImageURL(url=f'data:{item.media_type};base64,{base64_encoded}') content.append(MistralImageURLChunk(image_url=image_url, type='image_url')) elif item.media_type == 'application/pdf': content.append( MistralDocumentURLChunk( document_url=f'data:application/pdf;base64,{base64_encoded}', type='document_url' ) ) else: raise RuntimeError('BinaryContent other than image or PDF is not supported in Mistral.') elif isinstance(item, DocumentUrl): if item.media_type == 'application/pdf': content.append(MistralDocumentURLChunk(document_url=item.url, type='document_url')) else: raise RuntimeError('DocumentUrl other than PDF is not supported in Mistral.') elif isinstance(item, VideoUrl): raise RuntimeError('VideoUrl is not supported in Mistral.') else: # pragma: no cover raise RuntimeError(f'Unsupported content type: {type(item)}') return MistralUserMessage(content=content) ```` #### __init__ ````python __init__( model_name: MistralModelName, *, provider: ( Literal["mistral"] | Provider[Mistral] ) = "mistral", profile: ModelProfileSpec | None = None, json_mode_schema_prompt: str = "Answer in JSON Object, respect the format:\n```\n{schema}\n```\n", settings: ModelSettings | None = None ) ```` Initialize a Mistral model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `MistralModelName` | The name of the model to use. | *required* | | `provider` | `Literal['mistral'] | Provider[Mistral]` | The provider to use for authentication and API access. Can be either the string 'mistral' or an instance of Provider[Mistral]. If not provided, a new provider will be created using the other parameters. | `'mistral'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `json_mode_schema_prompt` | `str` | The prompt to show when the model expects a JSON object as input. 
| ```` 'Answer in JSON Object, respect the format:\n```\n{schema}\n```\n' ```` | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ````python def __init__( self, model_name: MistralModelName, *, provider: Literal['mistral'] | Provider[Mistral] = 'mistral', profile: ModelProfileSpec | None = None, json_mode_schema_prompt: str = """Answer in JSON Object, respect the format:\n```\n{schema}\n```\n""", settings: ModelSettings | None = None, ): """Initialize a Mistral model. Args: model_name: The name of the model to use. provider: The provider to use for authentication and API access. Can be either the string 'mistral' or an instance of `Provider[Mistral]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. json_mode_schema_prompt: The prompt to show when the model expects a JSON object as input. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name self.json_mode_schema_prompt = json_mode_schema_prompt if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ```` #### model_name ```python model_name: MistralModelName ``` The model name. #### system ```python system: str ``` The model provider. #### request ```python request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a non-streaming request to the model from Pydantic AI call. Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ```python async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a non-streaming request to the model from Pydantic AI call.""" check_allow_model_requests() response = await self._completions_create( messages, cast(MistralModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response ``` #### request_stream ```python request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a streaming request to the model from Pydantic AI call. Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ```python @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming request to the model from Pydantic AI call.""" check_allow_model_requests() response = await self._stream_completions_create( messages, cast(MistralModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) ``` ### MistralStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Mistral models. 
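You normally won't construct this class yourself; it's created internally when you stream from a `MistralModel`. A minimal sketch, assuming `MISTRAL_API_KEY` is set for the default provider:

```python
from pydantic_ai import Agent
from pydantic_ai.models.mistral import MistralModel

agent = Agent(MistralModel('mistral-small-latest'))


async def main():
    async with agent.run_stream('What is the capital of Italy?') as response:
        # The chunks iterated here are parsed by MistralStreamedResponse.
        async for text in response.stream_text():
            print(text)
```

*(You'll need to add `asyncio.run(main())` to run `main`.)*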
Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ```python @dataclass class MistralStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Mistral models.""" _model_name: MistralModelName _response: AsyncIterable[MistralCompletionEvent] _timestamp: datetime _provider_name: str _delta_content: str = field(default='', init=False) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: chunk: MistralCompletionEvent async for chunk in self._response: self._usage += _map_usage(chunk.data) if chunk.data.id: # pragma: no branch self.provider_response_id = chunk.data.id try: choice = chunk.data.choices[0] except IndexError: continue if raw_finish_reason := choice.finish_reason: self.provider_details = {'finish_reason': raw_finish_reason} self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) # Handle the text part of the response content = choice.delta.content text, thinking = _map_content(content) for thought in thinking: self._parts_manager.handle_thinking_delta(vendor_part_id='thinking', content=thought) if text: # Attempt to produce an output tool call from the received text output_tools = {c.name: c for c in self.model_request_parameters.output_tools} if output_tools: self._delta_content += text # TODO: Port to native "manual JSON" mode maybe_tool_call_part = self._try_get_output_tool_from_text(self._delta_content, output_tools) if maybe_tool_call_part: yield self._parts_manager.handle_tool_call_part( vendor_part_id='output', tool_name=maybe_tool_call_part.tool_name, args=maybe_tool_call_part.args_as_dict(), tool_call_id=maybe_tool_call_part.tool_call_id, ) else: maybe_event = self._parts_manager.handle_text_delta(vendor_part_id='content', content=text) if maybe_event is not None: # pragma: no branch yield maybe_event # Handle the explicit tool calls for index, dtc in enumerate(choice.delta.tool_calls or []): # It seems that mistral just sends full tool calls, so we just use them directly, rather than building yield self._parts_manager.handle_tool_call_part( vendor_part_id=index, tool_name=dtc.function.name, args=dtc.function.arguments, tool_call_id=dtc.id ) @property def model_name(self) -> MistralModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp @staticmethod def _try_get_output_tool_from_text(text: str, output_tools: dict[str, ToolDefinition]) -> ToolCallPart | None: output_json: dict[str, Any] | None = pydantic_core.from_json(text, allow_partial='trailing-strings') if output_json: for output_tool in output_tools.values(): # NOTE: Additional verification to prevent JSON validation to crash # Ensures required parameters in the JSON schema are respected, especially for stream-based return types. # Example with BaseModel and required fields. 
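                # For instance (illustrative), with an output tool whose schema requires `city` and
                # `country`, partial streamed JSON like '{"city": "Par' fails this check, so the
                # candidate tool call is skipped until all required fields have been received.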
if not MistralStreamedResponse._validate_required_json_schema( output_json, output_tool.parameters_json_schema ): continue # The following part_id will be thrown away return ToolCallPart(tool_name=output_tool.name, args=output_json) @staticmethod def _validate_required_json_schema(json_dict: dict[str, Any], json_schema: dict[str, Any]) -> bool: """Validate that all required parameters in the JSON schema are present in the JSON dictionary.""" required_params = json_schema.get('required', []) properties = json_schema.get('properties', {}) for param in required_params: if param not in json_dict: return False param_schema = properties.get(param, {}) param_type = param_schema.get('type') param_items_type = param_schema.get('items', {}).get('type') if param_type == 'array' and param_items_type: if not isinstance(json_dict[param], list): return False for item in json_dict[param]: if not isinstance(item, VALID_JSON_TYPE_MAPPING[param_items_type]): return False elif param_type and not isinstance(json_dict[param], VALID_JSON_TYPE_MAPPING[param_type]): return False if isinstance(json_dict[param], dict) and 'properties' in param_schema: nested_schema = param_schema if not MistralStreamedResponse._validate_required_json_schema(json_dict[param], nested_schema): return False return True ``` #### model_name ```python model_name: MistralModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models.openai` ## Setup For details on how to set up authentication with this model, see [model configuration for OpenAI](../../../models/openai/). ### OpenAIModelName ```python OpenAIModelName = str | AllModels ``` Possible OpenAI model names. Since OpenAI supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the OpenAI docs](https://platform.openai.com/docs/models) for a full list. Using this more broad type for the model name instead of the ChatModel definition allows this model to be used more easily with other model types (ie, Ollama, Deepseek). ### OpenAIChatModelSettings Bases: `ModelSettings` Settings used for an OpenAI model request. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python class OpenAIChatModelSettings(ModelSettings, total=False): """Settings used for an OpenAI model request.""" # ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. openai_reasoning_effort: ReasoningEffort """Constrains effort on reasoning for [reasoning models](https://platform.openai.com/docs/guides/reasoning). Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. """ openai_logprobs: bool """Include log probabilities in the response.""" openai_top_logprobs: int """Include log probabilities of the top n tokens in the response.""" openai_user: str """A unique identifier representing the end-user, which can help OpenAI monitor and detect abuse. See [OpenAI's safety best practices](https://platform.openai.com/docs/guides/safety-best-practices#end-user-ids) for more details. """ openai_service_tier: Literal['auto', 'default', 'flex', 'priority'] """The service tier to use for the model request. Currently supported values are `auto`, `default`, `flex`, and `priority`. 
For more information, see [OpenAI's service tiers documentation](https://platform.openai.com/docs/api-reference/chat/object#chat/object-service_tier). """ openai_prediction: ChatCompletionPredictionContentParam """Enables [predictive outputs](https://platform.openai.com/docs/guides/predicted-outputs). This feature is currently only supported for some OpenAI models. """ ``` #### openai_reasoning_effort ```python openai_reasoning_effort: ReasoningEffort ``` Constrains effort on reasoning for [reasoning models](https://platform.openai.com/docs/guides/reasoning). Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. #### openai_logprobs ```python openai_logprobs: bool ``` Include log probabilities in the response. #### openai_top_logprobs ```python openai_top_logprobs: int ``` Include log probabilities of the top n tokens in the response. #### openai_user ```python openai_user: str ``` A unique identifier representing the end-user, which can help OpenAI monitor and detect abuse. See [OpenAI's safety best practices](https://platform.openai.com/docs/guides/safety-best-practices#end-user-ids) for more details. #### openai_service_tier ```python openai_service_tier: Literal[ "auto", "default", "flex", "priority" ] ``` The service tier to use for the model request. Currently supported values are `auto`, `default`, `flex`, and `priority`. For more information, see [OpenAI's service tiers documentation](https://platform.openai.com/docs/api-reference/chat/object#chat/object-service_tier). #### openai_prediction ```python openai_prediction: ChatCompletionPredictionContentParam ``` Enables [predictive outputs](https://platform.openai.com/docs/guides/predicted-outputs). This feature is currently only supported for some OpenAI models. ### OpenAIModelSettings Bases: `OpenAIChatModelSettings` Deprecated Use `OpenAIChatModelSettings` instead. Deprecated alias for `OpenAIChatModelSettings`. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @deprecated('Use `OpenAIChatModelSettings` instead.') class OpenAIModelSettings(OpenAIChatModelSettings, total=False): """Deprecated alias for `OpenAIChatModelSettings`.""" ``` ### OpenAIResponsesModelSettings Bases: `OpenAIChatModelSettings` Settings used for an OpenAI Responses model request. ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python class OpenAIResponsesModelSettings(OpenAIChatModelSettings, total=False): """Settings used for an OpenAI Responses model request. ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. """ openai_builtin_tools: Sequence[FileSearchToolParam | WebSearchToolParam | ComputerToolParam] """The provided OpenAI built-in tools to use. See [OpenAI's built-in tools](https://platform.openai.com/docs/guides/tools?api-mode=responses) for more details. """ openai_reasoning_generate_summary: Literal['detailed', 'concise'] """Deprecated alias for `openai_reasoning_summary`.""" openai_reasoning_summary: Literal['detailed', 'concise'] """A summary of the reasoning performed by the model. This can be useful for debugging and understanding the model's reasoning process. One of `concise` or `detailed`. Check the [OpenAI Reasoning documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries) for more details. 
""" openai_send_reasoning_ids: bool """Whether to send reasoning IDs from the message history to the model. Enabled by default. This can result in errors like `"Item 'rs_123' of type 'reasoning' was provided without its required following item."` if the message history you're sending does not match exactly what was received from the Responses API in a previous response, for example if you're using a [history processor](../../message-history.md#processing-message-history). In that case, you'll want to disable this. """ openai_truncation: Literal['disabled', 'auto'] """The truncation strategy to use for the model response. It can be either: - `disabled` (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error. - `auto`: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. """ openai_text_verbosity: Literal['low', 'medium', 'high'] """Constrains the verbosity of the model's text response. Lower values will result in more concise responses, while higher values will result in more verbose responses. Currently supported values are `low`, `medium`, and `high`. """ openai_previous_response_id: Literal['auto'] | str """The ID of a previous response from the model to use as the starting point for a continued conversation. When set to `'auto'`, the request automatically uses the most recent `provider_response_id` from the message history and omits earlier messages. This enables the model to use server-side conversation state and faithfully reference previous reasoning. See the [OpenAI Responses API documentation](https://platform.openai.com/docs/guides/reasoning#keeping-reasoning-items-in-context) for more information. """ ``` #### openai_builtin_tools ```python openai_builtin_tools: Sequence[ FileSearchToolParam | WebSearchToolParam | ComputerToolParam ] ``` The provided OpenAI built-in tools to use. See [OpenAI's built-in tools](https://platform.openai.com/docs/guides/tools?api-mode=responses) for more details. #### openai_reasoning_generate_summary ```python openai_reasoning_generate_summary: Literal[ "detailed", "concise" ] ``` Deprecated alias for `openai_reasoning_summary`. #### openai_reasoning_summary ```python openai_reasoning_summary: Literal['detailed', 'concise'] ``` A summary of the reasoning performed by the model. This can be useful for debugging and understanding the model's reasoning process. One of `concise` or `detailed`. Check the [OpenAI Reasoning documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries) for more details. #### openai_send_reasoning_ids ```python openai_send_reasoning_ids: bool ``` Whether to send reasoning IDs from the message history to the model. Enabled by default. This can result in errors like `"Item 'rs_123' of type 'reasoning' was provided without its required following item."` if the message history you're sending does not match exactly what was received from the Responses API in a previous response, for example if you're using a [history processor](../../../message-history/#processing-message-history). In that case, you'll want to disable this. #### openai_truncation ```python openai_truncation: Literal['disabled', 'auto'] ``` The truncation strategy to use for the model response. 
It can be either: - `disabled` (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error. - `auto`: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. #### openai_text_verbosity ```python openai_text_verbosity: Literal['low', 'medium', 'high'] ``` Constrains the verbosity of the model's text response. Lower values will result in more concise responses, while higher values will result in more verbose responses. Currently supported values are `low`, `medium`, and `high`. #### openai_previous_response_id ```python openai_previous_response_id: Literal['auto'] | str ``` The ID of a previous response from the model to use as the starting point for a continued conversation. When set to `'auto'`, the request automatically uses the most recent `provider_response_id` from the message history and omits earlier messages. This enables the model to use server-side conversation state and faithfully reference previous reasoning. See the [OpenAI Responses API documentation](https://platform.openai.com/docs/guides/reasoning#keeping-reasoning-items-in-context) for more information. ### OpenAIChatModel Bases: `Model` A model that uses the OpenAI API. Internally, this uses the [OpenAI Python client](https://github.com/openai/openai-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @dataclass(init=False) class OpenAIChatModel(Model): """A model that uses the OpenAI API. Internally, this uses the [OpenAI Python client](https://github.com/openai/openai-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncOpenAI = field(repr=False) _model_name: OpenAIModelName = field(repr=False) _provider: Provider[AsyncOpenAI] = field(repr=False) @overload def __init__( self, model_name: OpenAIModelName, *, provider: Literal[ 'azure', 'deepseek', 'cerebras', 'fireworks', 'github', 'grok', 'heroku', 'moonshotai', 'ollama', 'openai', 'openai-chat', 'openrouter', 'together', 'vercel', 'litellm', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None: ... @deprecated('Set the `system_prompt_role` in the `OpenAIModelProfile` instead.') @overload def __init__( self, model_name: OpenAIModelName, *, provider: Literal[ 'azure', 'deepseek', 'cerebras', 'fireworks', 'github', 'grok', 'heroku', 'moonshotai', 'ollama', 'openai', 'openai-chat', 'openrouter', 'together', 'vercel', 'litellm', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, system_prompt_role: OpenAISystemPromptRole | None = None, settings: ModelSettings | None = None, ) -> None: ... def __init__( self, model_name: OpenAIModelName, *, provider: Literal[ 'azure', 'deepseek', 'cerebras', 'fireworks', 'github', 'grok', 'heroku', 'moonshotai', 'ollama', 'openai', 'openai-chat', 'openrouter', 'together', 'vercel', 'litellm', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, system_prompt_role: OpenAISystemPromptRole | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenAI model. Args: model_name: The name of the OpenAI model to use. 
List of model names available [here](https://github.com/openai/openai-python/blob/v1.54.3/src/openai/types/chat_model.py#L7) (Unfortunately, despite being ask to do so, OpenAI do not provide `.inv` files for their API). provider: The provider to use. Defaults to `'openai'`. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. system_prompt_role: The role to use for the system prompt message. If not provided, defaults to `'system'`. In the future, this may be inferred from the model name. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) if system_prompt_role is not None: self.profile = OpenAIModelProfile(openai_system_prompt_role=system_prompt_role).update(self.profile) @property def base_url(self) -> str: return str(self.client.base_url) @property def model_name(self) -> OpenAIModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name @property @deprecated('Set the `system_prompt_role` in the `OpenAIModelProfile` instead.') def system_prompt_role(self) -> OpenAISystemPromptRole | None: return OpenAIModelProfile.from_profile(self.profile).openai_system_prompt_role async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() response = await self._completions_create( messages, False, cast(OpenAIChatModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() response = await self._completions_create( messages, True, cast(OpenAIChatModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: OpenAIChatModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncStream[ChatCompletionChunk]: ... @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: OpenAIChatModelSettings, model_request_parameters: ModelRequestParameters, ) -> chat.ChatCompletion: ... 
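    # The two overloads above exist only to give type checkers a precise return type
    # (streaming vs. non-streaming); the implementation below handles both cases.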
async def _completions_create( self, messages: list[ModelMessage], stream: bool, model_settings: OpenAIChatModelSettings, model_request_parameters: ModelRequestParameters, ) -> chat.ChatCompletion | AsyncStream[ChatCompletionChunk]: tools = self._get_tools(model_request_parameters) web_search_options = self._get_web_search_options(model_request_parameters) if not tools: tool_choice: Literal['none', 'required', 'auto'] | None = None elif ( not model_request_parameters.allow_text_output and OpenAIModelProfile.from_profile(self.profile).openai_supports_tool_choice_required ): tool_choice = 'required' else: tool_choice = 'auto' openai_messages = await self._map_messages(messages) response_format: chat.completion_create_params.ResponseFormat | None = None if model_request_parameters.output_mode == 'native': output_object = model_request_parameters.output_object assert output_object is not None response_format = self._map_json_schema(output_object) elif ( model_request_parameters.output_mode == 'prompted' and self.profile.supports_json_object_output ): # pragma: no branch response_format = {'type': 'json_object'} unsupported_model_settings = OpenAIModelProfile.from_profile(self.profile).openai_unsupported_model_settings for setting in unsupported_model_settings: model_settings.pop(setting, None) try: extra_headers = model_settings.get('extra_headers', {}) extra_headers.setdefault('User-Agent', get_user_agent()) return await self.client.chat.completions.create( model=self._model_name, messages=openai_messages, parallel_tool_calls=model_settings.get('parallel_tool_calls', NOT_GIVEN), tools=tools or NOT_GIVEN, tool_choice=tool_choice or NOT_GIVEN, stream=stream, stream_options={'include_usage': True} if stream else NOT_GIVEN, stop=model_settings.get('stop_sequences', NOT_GIVEN), max_completion_tokens=model_settings.get('max_tokens', NOT_GIVEN), timeout=model_settings.get('timeout', NOT_GIVEN), response_format=response_format or NOT_GIVEN, seed=model_settings.get('seed', NOT_GIVEN), reasoning_effort=model_settings.get('openai_reasoning_effort', NOT_GIVEN), user=model_settings.get('openai_user', NOT_GIVEN), web_search_options=web_search_options or NOT_GIVEN, service_tier=model_settings.get('openai_service_tier', NOT_GIVEN), prediction=model_settings.get('openai_prediction', NOT_GIVEN), temperature=model_settings.get('temperature', NOT_GIVEN), top_p=model_settings.get('top_p', NOT_GIVEN), presence_penalty=model_settings.get('presence_penalty', NOT_GIVEN), frequency_penalty=model_settings.get('frequency_penalty', NOT_GIVEN), logit_bias=model_settings.get('logit_bias', NOT_GIVEN), logprobs=model_settings.get('openai_logprobs', NOT_GIVEN), top_logprobs=model_settings.get('openai_top_logprobs', NOT_GIVEN), extra_headers=extra_headers, extra_body=model_settings.get('extra_body'), ) except APIStatusError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover def _process_response(self, response: chat.ChatCompletion | str) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" # Although the OpenAI SDK claims to return a Pydantic model (`ChatCompletion`) from the chat completions function: # * it hasn't actually performed validation (presumably they're creating the model with `model_construct` or something?!) # * if the endpoint returns plain text, the return type is a string # Thus we validate it fully here. 
if not isinstance(response, chat.ChatCompletion): raise UnexpectedModelBehavior('Invalid response from OpenAI chat completions endpoint, expected JSON data') if response.created: timestamp = number_to_datetime(response.created) else: timestamp = _now_utc() response.created = int(timestamp.timestamp()) try: response = chat.ChatCompletion.model_validate(response.model_dump()) except ValidationError as e: raise UnexpectedModelBehavior(f'Invalid response from OpenAI chat completions endpoint: {e}') from e choice = response.choices[0] items: list[ModelResponsePart] = [] # The `reasoning_content` field is only present in DeepSeek models. # https://api-docs.deepseek.com/guides/reasoning_model if reasoning_content := getattr(choice.message, 'reasoning_content', None): items.append(ThinkingPart(id='reasoning_content', content=reasoning_content, provider_name=self.system)) # NOTE: We don't currently handle OpenRouter `reasoning_details`: # - https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks # NOTE: We don't currently handle OpenRouter/gpt-oss `reasoning`: # - https://cookbook.openai.com/articles/gpt-oss/handle-raw-cot#chat-completions-api # - https://openrouter.ai/docs/use-cases/reasoning-tokens#basic-usage-with-reasoning-tokens # If you need this, please file an issue. vendor_details: dict[str, Any] = {} # Add logprobs to vendor_details if available if choice.logprobs is not None and choice.logprobs.content: # Convert logprobs to a serializable format vendor_details['logprobs'] = [ { 'token': lp.token, 'bytes': lp.bytes, 'logprob': lp.logprob, 'top_logprobs': [ {'token': tlp.token, 'bytes': tlp.bytes, 'logprob': tlp.logprob} for tlp in lp.top_logprobs ], } for lp in choice.logprobs.content ] if choice.message.content is not None: items.extend( (replace(part, id='content', provider_name=self.system) if isinstance(part, ThinkingPart) else part) for part in split_content_into_text_and_thinking(choice.message.content, self.profile.thinking_tags) ) if choice.message.tool_calls is not None: for c in choice.message.tool_calls: if isinstance(c, ChatCompletionMessageFunctionToolCall): part = ToolCallPart(c.function.name, c.function.arguments, tool_call_id=c.id) elif isinstance(c, ChatCompletionMessageCustomToolCall): # pragma: no cover # NOTE: Custom tool calls are not supported. # See for more details. 
raise RuntimeError('Custom tool calls are not supported') else: assert_never(c) part.tool_call_id = _guard_tool_call_id(part) items.append(part) raw_finish_reason = choice.finish_reason vendor_details['finish_reason'] = raw_finish_reason finish_reason = _CHAT_FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=items, usage=_map_usage(response), model_name=response.model, timestamp=timestamp, provider_details=vendor_details or None, provider_response_id=response.id, provider_name=self._provider.name, finish_reason=finish_reason, ) async def _process_streamed_response( self, response: AsyncStream[ChatCompletionChunk], model_request_parameters: ModelRequestParameters ) -> OpenAIStreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior( # pragma: no cover 'Streamed response ended without content or tool calls' ) return OpenAIStreamedResponse( model_request_parameters=model_request_parameters, _model_name=first_chunk.model, _model_profile=self.profile, _response=peekable_response, _timestamp=number_to_datetime(first_chunk.created), _provider_name=self._provider.name, ) def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[chat.ChatCompletionToolParam]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] def _get_web_search_options(self, model_request_parameters: ModelRequestParameters) -> WebSearchOptions | None: for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): # pragma: no branch if not OpenAIModelProfile.from_profile(self.profile).openai_chat_supports_web_search: raise UserError( f'WebSearchTool is not supported with `OpenAIChatModel` and model {self.model_name!r}. ' f'Please use `OpenAIResponsesModel` instead.' ) if tool.user_location: return WebSearchOptions( search_context_size=tool.search_context_size, user_location=WebSearchOptionsUserLocation( type='approximate', approximate=WebSearchOptionsUserLocationApproximate(**tool.user_location), ), ) return WebSearchOptions(search_context_size=tool.search_context_size) else: raise UserError( f'`{tool.__class__.__name__}` is not supported by `OpenAIChatModel`. If it should be, please file an issue.' ) async def _map_messages(self, messages: list[ModelMessage]) -> list[chat.ChatCompletionMessageParam]: """Just maps a `pydantic_ai.Message` to a `openai.types.ChatCompletionMessageParam`.""" openai_messages: list[chat.ChatCompletionMessageParam] = [] for message in messages: if isinstance(message, ModelRequest): async for item in self._map_user_message(message): openai_messages.append(item) elif isinstance(message, ModelResponse): texts: list[str] = [] tool_calls: list[ChatCompletionMessageFunctionToolCallParam] = [] for item in message.parts: if isinstance(item, TextPart): texts.append(item.content) elif isinstance(item, ThinkingPart): # NOTE: DeepSeek `reasoning_content` field should NOT be sent back per https://api-docs.deepseek.com/guides/reasoning_model, # but we currently just send it in `` tags anyway as we don't want DeepSeek-specific checks here. # If you need this changed, please file an issue. 
start_tag, end_tag = self.profile.thinking_tags texts.append('\n'.join([start_tag, item.content, end_tag])) elif isinstance(item, ToolCallPart): tool_calls.append(self._map_tool_call(item)) # OpenAI doesn't return built-in tool calls elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover pass else: assert_never(item) message_param = chat.ChatCompletionAssistantMessageParam(role='assistant') if texts: # Note: model responses from this model should only have one text item, so the following # shouldn't merge multiple texts into one unless you switch models between runs: message_param['content'] = '\n\n'.join(texts) else: message_param['content'] = None if tool_calls: message_param['tool_calls'] = tool_calls openai_messages.append(message_param) else: assert_never(message) if instructions := self._get_instructions(messages): openai_messages.insert(0, chat.ChatCompletionSystemMessageParam(content=instructions, role='system')) return openai_messages @staticmethod def _map_tool_call(t: ToolCallPart) -> ChatCompletionMessageFunctionToolCallParam: return ChatCompletionMessageFunctionToolCallParam( id=_guard_tool_call_id(t=t), type='function', function={'name': t.tool_name, 'arguments': t.args_as_json_str()}, ) def _map_json_schema(self, o: OutputObjectDefinition) -> chat.completion_create_params.ResponseFormat: response_format_param: chat.completion_create_params.ResponseFormatJSONSchema = { # pyright: ignore[reportPrivateImportUsage] 'type': 'json_schema', 'json_schema': {'name': o.name or DEFAULT_OUTPUT_TOOL_NAME, 'schema': o.json_schema}, } if o.description: response_format_param['json_schema']['description'] = o.description if OpenAIModelProfile.from_profile(self.profile).openai_supports_strict_tool_definition: # pragma: no branch response_format_param['json_schema']['strict'] = o.strict return response_format_param def _map_tool_definition(self, f: ToolDefinition) -> chat.ChatCompletionToolParam: tool_param: chat.ChatCompletionToolParam = { 'type': 'function', 'function': { 'name': f.name, 'description': f.description or '', 'parameters': f.parameters_json_schema, }, } if f.strict and OpenAIModelProfile.from_profile(self.profile).openai_supports_strict_tool_definition: tool_param['function']['strict'] = f.strict return tool_param async def _map_user_message(self, message: ModelRequest) -> AsyncIterable[chat.ChatCompletionMessageParam]: for part in message.parts: if isinstance(part, SystemPromptPart): system_prompt_role = OpenAIModelProfile.from_profile(self.profile).openai_system_prompt_role if system_prompt_role == 'developer': yield chat.ChatCompletionDeveloperMessageParam(role='developer', content=part.content) elif system_prompt_role == 'user': yield chat.ChatCompletionUserMessageParam(role='user', content=part.content) else: yield chat.ChatCompletionSystemMessageParam(role='system', content=part.content) elif isinstance(part, UserPromptPart): yield await self._map_user_prompt(part) elif isinstance(part, ToolReturnPart): yield chat.ChatCompletionToolMessageParam( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response_str(), ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: yield chat.ChatCompletionUserMessageParam(role='user', content=part.model_response()) else: yield chat.ChatCompletionToolMessageParam( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response(), ) else: assert_never(part) @staticmethod async def _map_user_prompt(part: UserPromptPart) -> 
chat.ChatCompletionUserMessageParam: content: str | list[ChatCompletionContentPartParam] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(ChatCompletionContentPartTextParam(text=item, type='text')) elif isinstance(item, ImageUrl): image_url = ImageURL(url=item.url) content.append(ChatCompletionContentPartImageParam(image_url=image_url, type='image_url')) elif isinstance(item, BinaryContent): base64_encoded = base64.b64encode(item.data).decode('utf-8') if item.is_image: image_url = ImageURL(url=f'data:{item.media_type};base64,{base64_encoded}') content.append(ChatCompletionContentPartImageParam(image_url=image_url, type='image_url')) elif item.is_audio: assert item.format in ('wav', 'mp3') audio = InputAudio(data=base64_encoded, format=item.format) content.append(ChatCompletionContentPartInputAudioParam(input_audio=audio, type='input_audio')) elif item.is_document: content.append( File( file=FileFile( file_data=f'data:{item.media_type};base64,{base64_encoded}', filename=f'filename.{item.format}', ), type='file', ) ) else: # pragma: no cover raise RuntimeError(f'Unsupported binary content type: {item.media_type}') elif isinstance(item, AudioUrl): downloaded_item = await download_item(item, data_format='base64', type_format='extension') assert downloaded_item['data_type'] in ( 'wav', 'mp3', ), f'Unsupported audio format: {downloaded_item["data_type"]}' audio = InputAudio(data=downloaded_item['data'], format=downloaded_item['data_type']) content.append(ChatCompletionContentPartInputAudioParam(input_audio=audio, type='input_audio')) elif isinstance(item, DocumentUrl): downloaded_item = await download_item(item, data_format='base64_uri', type_format='extension') file = File( file=FileFile( file_data=downloaded_item['data'], filename=f'filename.{downloaded_item["data_type"]}' ), type='file', ) content.append(file) elif isinstance(item, VideoUrl): # pragma: no cover raise NotImplementedError('VideoUrl is not supported for OpenAI') else: assert_never(item) return chat.ChatCompletionUserMessageParam(role='user', content=content) ``` #### __init__ ```python __init__( model_name: OpenAIModelName, *, provider: ( Literal[ "azure", "deepseek", "cerebras", "fireworks", "github", "grok", "heroku", "moonshotai", "ollama", "openai", "openai-chat", "openrouter", "together", "vercel", "litellm", ] | Provider[AsyncOpenAI] ) = "openai", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) -> None ``` ```python __init__( model_name: OpenAIModelName, *, provider: ( Literal[ "azure", "deepseek", "cerebras", "fireworks", "github", "grok", "heroku", "moonshotai", "ollama", "openai", "openai-chat", "openrouter", "together", "vercel", "litellm", ] | Provider[AsyncOpenAI] ) = "openai", profile: ModelProfileSpec | None = None, system_prompt_role: ( OpenAISystemPromptRole | None ) = None, settings: ModelSettings | None = None ) -> None ``` ```python __init__( model_name: OpenAIModelName, *, provider: ( Literal[ "azure", "deepseek", "cerebras", "fireworks", "github", "grok", "heroku", "moonshotai", "ollama", "openai", "openai-chat", "openrouter", "together", "vercel", "litellm", ] | Provider[AsyncOpenAI] ) = "openai", profile: ModelProfileSpec | None = None, system_prompt_role: ( OpenAISystemPromptRole | None ) = None, settings: ModelSettings | None = None ) ``` Initialize an OpenAI model. 
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `OpenAIModelName` | The name of the OpenAI model to use. List of model names available [here](https://github.com/openai/openai-python/blob/v1.54.3/src/openai/types/chat_model.py#L7) (Unfortunately, despite being asked to do so, OpenAI does not provide `.inv` files for their API). | *required* | | `provider` | `Literal['azure', 'deepseek', 'cerebras', 'fireworks', 'github', 'grok', 'heroku', 'moonshotai', 'ollama', 'openai', 'openai-chat', 'openrouter', 'together', 'vercel', 'litellm'] | Provider[AsyncOpenAI]` | The provider to use. Defaults to 'openai'. | `'openai'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `system_prompt_role` | `OpenAISystemPromptRole | None` | The role to use for the system prompt message. If not provided, defaults to 'system'. In the future, this may be inferred from the model name. | `None` | | `settings` | `ModelSettings | None` | Default model settings for this model instance. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python def __init__( self, model_name: OpenAIModelName, *, provider: Literal[ 'azure', 'deepseek', 'cerebras', 'fireworks', 'github', 'grok', 'heroku', 'moonshotai', 'ollama', 'openai', 'openai-chat', 'openrouter', 'together', 'vercel', 'litellm', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, system_prompt_role: OpenAISystemPromptRole | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenAI model. Args: model_name: The name of the OpenAI model to use. List of model names available [here](https://github.com/openai/openai-python/blob/v1.54.3/src/openai/types/chat_model.py#L7) (Unfortunately, despite being asked to do so, OpenAI does not provide `.inv` files for their API). provider: The provider to use. Defaults to `'openai'`. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. system_prompt_role: The role to use for the system prompt message. If not provided, defaults to `'system'`. In the future, this may be inferred from the model name. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) if system_prompt_role is not None: self.profile = OpenAIModelProfile(openai_system_prompt_role=system_prompt_role).update(self.profile) ``` #### model_name ```python model_name: OpenAIModelName ``` The model name. #### system ```python system: str ``` The model provider. ### OpenAIModel Bases: `OpenAIChatModel` Deprecated `OpenAIModel` was renamed to `OpenAIChatModel` to clearly distinguish it from `OpenAIResponsesModel` which uses OpenAI's newer Responses API. Use that unless you're using an OpenAI Chat Completions-compatible API, or require a feature that the Responses API doesn't support yet like audio. Deprecated alias for `OpenAIChatModel`. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @deprecated( '`OpenAIModel` was renamed to `OpenAIChatModel` to clearly distinguish it from `OpenAIResponsesModel` which ' "uses OpenAI's newer Responses API. Use that unless you're using an OpenAI Chat Completions-compatible API, or " "require a feature that the Responses API doesn't support yet like audio."
) @dataclass(init=False) class OpenAIModel(OpenAIChatModel): """Deprecated alias for `OpenAIChatModel`.""" ``` ### OpenAIResponsesModel Bases: `Model` A model that uses the OpenAI Responses API. The [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses) is the new API for OpenAI models. If you are interested in the differences between the Responses API and the Chat Completions API, see the [OpenAI API docs](https://platform.openai.com/docs/guides/responses-vs-chat-completions). Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @dataclass(init=False) class OpenAIResponsesModel(Model): """A model that uses the OpenAI Responses API. The [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses) is the new API for OpenAI models. If you are interested in the differences between the Responses API and the Chat Completions API, see the [OpenAI API docs](https://platform.openai.com/docs/guides/responses-vs-chat-completions). """ client: AsyncOpenAI = field(repr=False) _model_name: OpenAIModelName = field(repr=False) _provider: Provider[AsyncOpenAI] = field(repr=False) def __init__( self, model_name: OpenAIModelName, *, provider: Literal['openai', 'deepseek', 'azure', 'openrouter', 'grok', 'fireworks', 'together'] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenAI Responses model. Args: model_name: The name of the OpenAI model to use. provider: The provider to use. Defaults to `'openai'`. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def model_name(self) -> OpenAIModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelRequest | ModelResponse], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() response = await self._responses_create( messages, False, cast(OpenAIResponsesModelSettings, model_settings or {}), model_request_parameters ) return self._process_response(response) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() response = await self._responses_create( messages, True, cast(OpenAIResponsesModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) def _process_response(self, response: responses.Response) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" timestamp = number_to_datetime(response.created_at) items: list[ModelResponsePart] = [] for item in response.output: if isinstance(item, responses.ResponseReasoningItem): signature = item.encrypted_content if item.summary: for summary in item.summary: # We use the same id for all summaries so that we can merge them on the round trip. 
items.append( ThinkingPart( content=summary.text, id=item.id, signature=signature, provider_name=self.system if signature else None, ) ) # We only need to store the signature once. signature = None elif signature: items.append( ThinkingPart( content='', id=item.id, signature=signature, provider_name=self.system, ) ) # NOTE: We don't currently handle the raw CoT from gpt-oss `reasoning_text`: https://cookbook.openai.com/articles/gpt-oss/handle-raw-cot # If you need this, please file an issue. elif isinstance(item, responses.ResponseOutputMessage): for content in item.content: if isinstance(content, responses.ResponseOutputText): # pragma: no branch items.append(TextPart(content.text, id=item.id)) elif isinstance(item, responses.ResponseFunctionToolCall): items.append( ToolCallPart(item.name, item.arguments, tool_call_id=_combine_tool_call_ids(item.call_id, item.id)) ) finish_reason: FinishReason | None = None provider_details: dict[str, Any] | None = None raw_finish_reason = details.reason if (details := response.incomplete_details) else response.status if raw_finish_reason: provider_details = {'finish_reason': raw_finish_reason} finish_reason = _RESPONSES_FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=items, usage=_map_usage(response), model_name=response.model, provider_response_id=response.id, timestamp=timestamp, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: AsyncStream[responses.ResponseStreamEvent], model_request_parameters: ModelRequestParameters, ) -> OpenAIResponsesStreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): # pragma: no cover raise UnexpectedModelBehavior('Streamed response ended without content or tool calls') assert isinstance(first_chunk, responses.ResponseCreatedEvent) return OpenAIResponsesStreamedResponse( model_request_parameters=model_request_parameters, _model_name=first_chunk.response.model, _response=peekable_response, _timestamp=number_to_datetime(first_chunk.response.created_at), _provider_name=self._provider.name, ) @overload async def _responses_create( self, messages: list[ModelRequest | ModelResponse], stream: Literal[False], model_settings: OpenAIResponsesModelSettings, model_request_parameters: ModelRequestParameters, ) -> responses.Response: ... @overload async def _responses_create( self, messages: list[ModelRequest | ModelResponse], stream: Literal[True], model_settings: OpenAIResponsesModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncStream[responses.ResponseStreamEvent]: ... 
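    # As with `_completions_create` on `OpenAIChatModel`, the overloads above only narrow
    # the return type; the single implementation below serves both streaming and
    # non-streaming Responses API requests.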
async def _responses_create( self, messages: list[ModelRequest | ModelResponse], stream: bool, model_settings: OpenAIResponsesModelSettings, model_request_parameters: ModelRequestParameters, ) -> responses.Response | AsyncStream[responses.ResponseStreamEvent]: tools = ( self._get_builtin_tools(model_request_parameters) + list(model_settings.get('openai_builtin_tools', [])) + self._get_tools(model_request_parameters) ) if not tools: tool_choice: Literal['none', 'required', 'auto'] | None = None elif not model_request_parameters.allow_text_output: tool_choice = 'required' else: tool_choice = 'auto' previous_response_id = model_settings.get('openai_previous_response_id') if previous_response_id == 'auto': previous_response_id, messages = self._get_previous_response_id_and_new_messages(messages) instructions, openai_messages = await self._map_messages(messages, model_settings) reasoning = self._get_reasoning(model_settings) text: responses.ResponseTextConfigParam | None = None if model_request_parameters.output_mode == 'native': output_object = model_request_parameters.output_object assert output_object is not None text = {'format': self._map_json_schema(output_object)} elif ( model_request_parameters.output_mode == 'prompted' and self.profile.supports_json_object_output ): # pragma: no branch text = {'format': {'type': 'json_object'}} # Without this trick, we'd hit this error: # > Response input messages must contain the word 'json' in some form to use 'text.format' of type 'json_object'. # Apparently they're only checking input messages for "JSON", not instructions. assert isinstance(instructions, str) openai_messages.insert(0, responses.EasyInputMessageParam(role='system', content=instructions)) instructions = NOT_GIVEN if verbosity := model_settings.get('openai_text_verbosity'): text = text or {} text['verbosity'] = verbosity profile = OpenAIModelProfile.from_profile(self.profile) unsupported_model_settings = profile.openai_unsupported_model_settings for setting in unsupported_model_settings: model_settings.pop(setting, None) include: list[responses.ResponseIncludable] | None = None if profile.openai_supports_encrypted_reasoning_content: include = ['reasoning.encrypted_content'] try: extra_headers = model_settings.get('extra_headers', {}) extra_headers.setdefault('User-Agent', get_user_agent()) return await self.client.responses.create( input=openai_messages, model=self._model_name, instructions=instructions, parallel_tool_calls=model_settings.get('parallel_tool_calls', NOT_GIVEN), tools=tools or NOT_GIVEN, tool_choice=tool_choice or NOT_GIVEN, max_output_tokens=model_settings.get('max_tokens', NOT_GIVEN), stream=stream, temperature=model_settings.get('temperature', NOT_GIVEN), top_p=model_settings.get('top_p', NOT_GIVEN), truncation=model_settings.get('openai_truncation', NOT_GIVEN), timeout=model_settings.get('timeout', NOT_GIVEN), service_tier=model_settings.get('openai_service_tier', NOT_GIVEN), previous_response_id=previous_response_id, reasoning=reasoning, user=model_settings.get('openai_user', NOT_GIVEN), text=text or NOT_GIVEN, include=include or NOT_GIVEN, extra_headers=extra_headers, extra_body=model_settings.get('extra_body'), ) except APIStatusError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover def _get_reasoning(self, model_settings: OpenAIResponsesModelSettings) -> Reasoning | NotGiven: reasoning_effort = model_settings.get('openai_reasoning_effort', 
None) reasoning_summary = model_settings.get('openai_reasoning_summary', None) reasoning_generate_summary = model_settings.get('openai_reasoning_generate_summary', None) if reasoning_summary and reasoning_generate_summary: # pragma: no cover raise ValueError('`openai_reasoning_summary` and `openai_reasoning_generate_summary` cannot both be set.') if reasoning_generate_summary is not None: # pragma: no cover warnings.warn( '`openai_reasoning_generate_summary` is deprecated, use `openai_reasoning_summary` instead', DeprecationWarning, ) reasoning_summary = reasoning_generate_summary if reasoning_effort is None and reasoning_summary is None: return NOT_GIVEN return Reasoning(effort=reasoning_effort, summary=reasoning_summary) def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[responses.FunctionToolParam]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] def _get_builtin_tools(self, model_request_parameters: ModelRequestParameters) -> list[responses.ToolParam]: tools: list[responses.ToolParam] = [] for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): web_search_tool = responses.WebSearchToolParam( type='web_search_preview', search_context_size=tool.search_context_size ) if tool.user_location: web_search_tool['user_location'] = responses.web_search_tool_param.UserLocation( type='approximate', **tool.user_location ) tools.append(web_search_tool) elif isinstance(tool, CodeExecutionTool): # pragma: no branch tools.append({'type': 'code_interpreter', 'container': {'type': 'auto'}}) else: raise UserError( # pragma: no cover f'`{tool.__class__.__name__}` is not supported by `OpenAIResponsesModel`. If it should be, please file an issue.' ) return tools def _map_tool_definition(self, f: ToolDefinition) -> responses.FunctionToolParam: return { 'name': f.name, 'parameters': f.parameters_json_schema, 'type': 'function', 'description': f.description, 'strict': bool( f.strict and OpenAIModelProfile.from_profile(self.profile).openai_supports_strict_tool_definition ), } def _get_previous_response_id_and_new_messages( self, messages: list[ModelMessage] ) -> tuple[str | None, list[ModelMessage]]: # When `openai_previous_response_id` is set to 'auto', the most recent # `provider_response_id` from the message history is selected and all # earlier messages are omitted. This allows the OpenAI SDK to reuse # server-side history for efficiency. The returned tuple contains the # `previous_response_id` (if found) and the trimmed list of messages. 
previous_response_id = None trimmed_messages: list[ModelMessage] = [] for m in reversed(messages): if isinstance(m, ModelResponse) and m.provider_name == self.system: previous_response_id = m.provider_response_id break else: trimmed_messages.append(m) if previous_response_id and trimmed_messages: return previous_response_id, list(reversed(trimmed_messages)) else: return None, messages async def _map_messages( # noqa: C901 self, messages: list[ModelMessage], model_settings: OpenAIResponsesModelSettings ) -> tuple[str | NotGiven, list[responses.ResponseInputItemParam]]: """Just maps a `pydantic_ai.Message` to a `openai.types.responses.ResponseInputParam`.""" openai_messages: list[responses.ResponseInputItemParam] = [] for message in messages: if isinstance(message, ModelRequest): for part in message.parts: if isinstance(part, SystemPromptPart): openai_messages.append(responses.EasyInputMessageParam(role='system', content=part.content)) elif isinstance(part, UserPromptPart): openai_messages.append(await self._map_user_prompt(part)) elif isinstance(part, ToolReturnPart): call_id = _guard_tool_call_id(t=part) call_id, _ = _split_combined_tool_call_id(call_id) item = FunctionCallOutput( type='function_call_output', call_id=call_id, output=part.model_response_str(), ) openai_messages.append(item) elif isinstance(part, RetryPromptPart): # TODO(Marcelo): How do we test this conditional branch? if part.tool_name is None: # pragma: no cover openai_messages.append( Message(role='user', content=[{'type': 'input_text', 'text': part.model_response()}]) ) else: call_id = _guard_tool_call_id(t=part) call_id, _ = _split_combined_tool_call_id(call_id) item = FunctionCallOutput( type='function_call_output', call_id=call_id, output=part.model_response(), ) openai_messages.append(item) else: assert_never(part) elif isinstance(message, ModelResponse): message_item: responses.ResponseOutputMessageParam | None = None reasoning_item: responses.ResponseReasoningItemParam | None = None for item in message.parts: if isinstance(item, TextPart): if item.id and message.provider_name == self.system: if message_item is None or message_item['id'] != item.id: # pragma: no branch message_item = responses.ResponseOutputMessageParam( role='assistant', id=item.id or _utils.generate_tool_call_id(), content=[], type='message', status='completed', ) openai_messages.append(message_item) message_item['content'] = [ *message_item['content'], responses.ResponseOutputTextParam( text=item.content, type='output_text', annotations=[] ), ] else: openai_messages.append( responses.EasyInputMessageParam(role='assistant', content=item.content) ) elif isinstance(item, ToolCallPart): openai_messages.append(self._map_tool_call(item)) elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): # We don't currently track built-in tool calls from OpenAI pass elif isinstance(item, ThinkingPart): if ( item.id and message.provider_name == self.system and model_settings.get('openai_send_reasoning_ids', True) ): signature: str | None = None if ( item.signature and item.provider_name == self.system and OpenAIModelProfile.from_profile( self.profile ).openai_supports_encrypted_reasoning_content ): signature = item.signature if (reasoning_item is None or reasoning_item['id'] != item.id) and ( signature or item.content ): # pragma: no branch reasoning_item = responses.ResponseReasoningItemParam( id=item.id, summary=[], encrypted_content=signature, type='reasoning', ) openai_messages.append(reasoning_item) if item.content: # The check above guarantees 
that `reasoning_item` is not None assert reasoning_item is not None reasoning_item['summary'] = [ *reasoning_item['summary'], Summary(text=item.content, type='summary_text'), ] else: start_tag, end_tag = self.profile.thinking_tags openai_messages.append( responses.EasyInputMessageParam( role='assistant', content='\n'.join([start_tag, item.content, end_tag]) ) ) else: assert_never(item) else: assert_never(message) instructions = self._get_instructions(messages) or NOT_GIVEN return instructions, openai_messages @staticmethod def _map_tool_call(t: ToolCallPart) -> responses.ResponseFunctionToolCallParam: call_id = _guard_tool_call_id(t=t) call_id, id = _split_combined_tool_call_id(call_id) param = responses.ResponseFunctionToolCallParam( name=t.tool_name, arguments=t.args_as_json_str(), call_id=call_id, type='function_call', ) if id: # pragma: no branch param['id'] = id return param def _map_json_schema(self, o: OutputObjectDefinition) -> responses.ResponseFormatTextJSONSchemaConfigParam: response_format_param: responses.ResponseFormatTextJSONSchemaConfigParam = { 'type': 'json_schema', 'name': o.name or DEFAULT_OUTPUT_TOOL_NAME, 'schema': o.json_schema, } if o.description: response_format_param['description'] = o.description if OpenAIModelProfile.from_profile(self.profile).openai_supports_strict_tool_definition: # pragma: no branch response_format_param['strict'] = o.strict return response_format_param @staticmethod async def _map_user_prompt(part: UserPromptPart) -> responses.EasyInputMessageParam: content: str | list[responses.ResponseInputContentParam] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(responses.ResponseInputTextParam(text=item, type='input_text')) elif isinstance(item, BinaryContent): base64_encoded = base64.b64encode(item.data).decode('utf-8') if item.is_image: content.append( responses.ResponseInputImageParam( image_url=f'data:{item.media_type};base64,{base64_encoded}', type='input_image', detail='auto', ) ) elif item.is_document: content.append( responses.ResponseInputFileParam( type='input_file', file_data=f'data:{item.media_type};base64,{base64_encoded}', # NOTE: Type wise it's not necessary to include the filename, but it's required by the # API itself. If we add empty string, the server sends a 500 error - which OpenAI needs # to fix. In any case, we add a placeholder name. 
filename=f'filename.{item.format}', ) ) elif item.is_audio: raise NotImplementedError('Audio as binary content is not supported for OpenAI Responses API.') else: # pragma: no cover raise RuntimeError(f'Unsupported binary content type: {item.media_type}') elif isinstance(item, ImageUrl): content.append( responses.ResponseInputImageParam(image_url=item.url, type='input_image', detail='auto') ) elif isinstance(item, AudioUrl): # pragma: no cover downloaded_item = await download_item(item, data_format='base64_uri', type_format='extension') content.append( responses.ResponseInputFileParam( type='input_file', file_data=downloaded_item['data'], filename=f'filename.{downloaded_item["data_type"]}', ) ) elif isinstance(item, DocumentUrl): downloaded_item = await download_item(item, data_format='base64_uri', type_format='extension') content.append( responses.ResponseInputFileParam( type='input_file', file_data=downloaded_item['data'], filename=f'filename.{downloaded_item["data_type"]}', ) ) elif isinstance(item, VideoUrl): # pragma: no cover raise NotImplementedError('VideoUrl is not supported for OpenAI.') else: assert_never(item) return responses.EasyInputMessageParam(role='user', content=content) ``` #### __init__ ```python __init__( model_name: OpenAIModelName, *, provider: ( Literal[ "openai", "deepseek", "azure", "openrouter", "grok", "fireworks", "together", ] | Provider[AsyncOpenAI] ) = "openai", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize an OpenAI Responses model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `OpenAIModelName` | The name of the OpenAI model to use. | *required* | | `provider` | `Literal['openai', 'deepseek', 'azure', 'openrouter', 'grok', 'fireworks', 'together'] | Provider[AsyncOpenAI]` | The provider to use. Defaults to 'openai'. | `'openai'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | Default model settings for this model instance. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python def __init__( self, model_name: OpenAIModelName, *, provider: Literal['openai', 'deepseek', 'azure', 'openrouter', 'grok', 'fireworks', 'together'] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenAI Responses model. Args: model_name: The name of the OpenAI model to use. provider: The provider to use. Defaults to `'openai'`. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: OpenAIModelName ``` The model name. #### system ```python system: str ``` The model provider. # `pydantic_ai.models.test` Utility model for quickly testing apps built with Pydantic AI. 
Here's a minimal example: test_model_usage.py ```py from pydantic_ai import Agent from pydantic_ai.models.test import TestModel my_agent = Agent('openai:gpt-4o', system_prompt='...') async def test_my_agent(): """Unit test for my_agent, to be run by pytest.""" m = TestModel() with my_agent.override(model=m): result = await my_agent.run('Testing my agent...') assert result.output == 'success (no tool calls)' assert m.last_model_request_parameters.function_tools == [] ``` See [Unit testing with `TestModel`](../../../testing/#unit-testing-with-testmodel) for detailed documentation. ### TestModel Bases: `Model` A model specifically for testing purposes. This will (by default) call all tools in the agent, then return a tool response if possible, otherwise a plain response. How useful this model is will vary significantly. Apart from `__init__` derived by the `dataclass` decorator, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/test.py` ```python @dataclass(init=False) class TestModel(Model): """A model specifically for testing purposes. This will (by default) call all tools in the agent, then return a tool response if possible, otherwise a plain response. How useful this model is will vary significantly. Apart from `__init__` derived by the `dataclass` decorator, all methods are private or match those of the base class. """ # NOTE: Avoid test discovery by pytest. __test__ = False call_tools: list[str] | Literal['all'] = 'all' """List of tools to call. If `'all'`, all tools will be called.""" custom_output_text: str | None = None """If set, this text is returned as the final output.""" custom_output_args: Any | None = None """If set, these args will be passed to the output tool.""" seed: int = 0 """Seed for generating random data.""" last_model_request_parameters: ModelRequestParameters | None = field(default=None, init=False) """The last ModelRequestParameters passed to the model in a request. The ModelRequestParameters contains information about the function and output tools available during request handling. This is set when a request is made, so will reflect the function tools from the last step of the last run. 
""" _model_name: str = field(default='test', repr=False) _system: str = field(default='test', repr=False) def __init__( self, *, call_tools: list[str] | Literal['all'] = 'all', custom_output_text: str | None = None, custom_output_args: Any | None = None, seed: int = 0, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize TestModel with optional settings and profile.""" self.call_tools = call_tools self.custom_output_text = custom_output_text self.custom_output_args = custom_output_args self.seed = seed self.last_model_request_parameters = None self._model_name = 'test' self._system = 'test' super().__init__(settings=settings, profile=profile) async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: self.last_model_request_parameters = model_request_parameters model_response = self._request(messages, model_settings, model_request_parameters) model_response.usage = _estimate_usage([*messages, model_response]) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: self.last_model_request_parameters = model_request_parameters model_response = self._request(messages, model_settings, model_request_parameters) yield TestStreamedResponse( model_request_parameters=model_request_parameters, _model_name=self._model_name, _structured_response=model_response, _messages=messages, _provider_name=self._system, ) @property def model_name(self) -> str: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._system def gen_tool_args(self, tool_def: ToolDefinition) -> Any: return _JsonSchemaTestData(tool_def.parameters_json_schema, self.seed).generate() def _get_tool_calls(self, model_request_parameters: ModelRequestParameters) -> list[tuple[str, ToolDefinition]]: if self.call_tools == 'all': return [(r.name, r) for r in model_request_parameters.function_tools] else: function_tools_lookup = {t.name: t for t in model_request_parameters.function_tools} tools_to_call = (function_tools_lookup[name] for name in self.call_tools) return [(r.name, r) for r in tools_to_call] def _get_output(self, model_request_parameters: ModelRequestParameters) -> _WrappedTextOutput | _WrappedToolOutput: if self.custom_output_text is not None: assert model_request_parameters.output_mode != 'tool', ( 'Plain response not allowed, but `custom_output_text` is set.' ) assert self.custom_output_args is None, 'Cannot set both `custom_output_text` and `custom_output_args`.' return _WrappedTextOutput(self.custom_output_text) elif self.custom_output_args is not None: assert model_request_parameters.output_tools is not None, ( 'No output tools provided, but `custom_output_args` is set.' 
) output_tool = model_request_parameters.output_tools[0] if k := output_tool.outer_typed_dict_key: return _WrappedToolOutput({k: self.custom_output_args}) else: return _WrappedToolOutput(self.custom_output_args) elif model_request_parameters.allow_text_output: return _WrappedTextOutput(None) elif model_request_parameters.output_tools: return _WrappedToolOutput(None) else: return _WrappedTextOutput(None) def _request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: if model_request_parameters.builtin_tools: raise UserError('TestModel does not support built-in tools') tool_calls = self._get_tool_calls(model_request_parameters) output_wrapper = self._get_output(model_request_parameters) output_tools = model_request_parameters.output_tools # if there are tools, the first thing we want to do is call all of them if tool_calls and not any(isinstance(m, ModelResponse) for m in messages): return ModelResponse( parts=[ ToolCallPart(name, self.gen_tool_args(args), tool_call_id=f'pyd_ai_tool_call_id__{name}') for name, args in tool_calls ], model_name=self._model_name, ) if messages: # pragma: no branch last_message = messages[-1] assert isinstance(last_message, ModelRequest), 'Expected last message to be a `ModelRequest`.' # check if there are any retry prompts, if so retry them new_retry_names = {p.tool_name for p in last_message.parts if isinstance(p, RetryPromptPart)} if new_retry_names: # Handle retries for both function tools and output tools # Check function tools first retry_parts: list[ModelResponsePart] = [ ToolCallPart(name, self.gen_tool_args(args)) for name, args in tool_calls if name in new_retry_names ] # Check output tools if output_tools: retry_parts.extend( [ ToolCallPart( tool.name, output_wrapper.value if isinstance(output_wrapper, _WrappedToolOutput) and output_wrapper.value is not None else self.gen_tool_args(tool), tool_call_id=f'pyd_ai_tool_call_id__{tool.name}', ) for tool in output_tools if tool.name in new_retry_names ] ) return ModelResponse(parts=retry_parts, model_name=self._model_name) if isinstance(output_wrapper, _WrappedTextOutput): if (response_text := output_wrapper.value) is None: # build up details of tool responses output: dict[str, Any] = {} for message in messages: if isinstance(message, ModelRequest): for part in message.parts: if isinstance(part, ToolReturnPart): output[part.tool_name] = part.content if output: return ModelResponse( parts=[TextPart(pydantic_core.to_json(output).decode())], model_name=self._model_name ) else: return ModelResponse(parts=[TextPart('success (no tool calls)')], model_name=self._model_name) else: return ModelResponse(parts=[TextPart(response_text)], model_name=self._model_name) else: assert output_tools, 'No output tools provided' custom_output_args = output_wrapper.value output_tool = output_tools[self.seed % len(output_tools)] if custom_output_args is not None: return ModelResponse( parts=[ ToolCallPart( output_tool.name, custom_output_args, tool_call_id=f'pyd_ai_tool_call_id__{output_tool.name}', ) ], model_name=self._model_name, ) else: response_args = self.gen_tool_args(output_tool) return ModelResponse( parts=[ ToolCallPart( output_tool.name, response_args, tool_call_id=f'pyd_ai_tool_call_id__{output_tool.name}', ) ], model_name=self._model_name, ) ``` #### __init__ ```python __init__( *, call_tools: list[str] | Literal["all"] = "all", custom_output_text: str | None = None, custom_output_args: Any | None = None, seed: int = 0, 
profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize TestModel with optional settings and profile. Source code in `pydantic_ai_slim/pydantic_ai/models/test.py` ```python def __init__( self, *, call_tools: list[str] | Literal['all'] = 'all', custom_output_text: str | None = None, custom_output_args: Any | None = None, seed: int = 0, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize TestModel with optional settings and profile.""" self.call_tools = call_tools self.custom_output_text = custom_output_text self.custom_output_args = custom_output_args self.seed = seed self.last_model_request_parameters = None self._model_name = 'test' self._system = 'test' super().__init__(settings=settings, profile=profile) ``` #### call_tools ```python call_tools: list[str] | Literal['all'] = call_tools ``` List of tools to call. If `'all'`, all tools will be called. #### custom_output_text ```python custom_output_text: str | None = custom_output_text ``` If set, this text is returned as the final output. #### custom_output_args ```python custom_output_args: Any | None = custom_output_args ``` If set, these args will be passed to the output tool. #### seed ```python seed: int = seed ``` Seed for generating random data. #### last_model_request_parameters ```python last_model_request_parameters: ( ModelRequestParameters | None ) = None ``` The last ModelRequestParameters passed to the model in a request. The ModelRequestParameters contains information about the function and output tools available during request handling. This is set when a request is made, so will reflect the function tools from the last step of the last run. #### model_name ```python model_name: str ``` The model name. #### system ```python system: str ``` The model provider. ### TestStreamedResponse Bases: `StreamedResponse` A structured response that streams test data. 
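When an agent is overridden with `TestModel`, streamed runs are served by `TestStreamedResponse` under the hood (see `request_stream` above). As a minimal, hedged sketch, the example below streams a canned output; the model name, prompt, and output text are placeholders:

```python
import asyncio

from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent('openai:gpt-4o', system_prompt='...')  # placeholder agent


async def main():
    # While the override is active, streamed runs are backed by
    # TestStreamedResponse rather than a real provider response.
    with agent.override(model=TestModel(custom_output_text='streaming works')):
        async with agent.run_stream('Say anything') as response:
            async for text in response.stream_text():
                print(text)


asyncio.run(main())
```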
Source code in `pydantic_ai_slim/pydantic_ai/models/test.py` ```python @dataclass class TestStreamedResponse(StreamedResponse): """A structured response that streams test data.""" _model_name: str _structured_response: ModelResponse _messages: InitVar[Iterable[ModelMessage]] _provider_name: str _timestamp: datetime = field(default_factory=_utils.now_utc, init=False) def __post_init__(self, _messages: Iterable[ModelMessage]): self._usage = _estimate_usage(_messages) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: for i, part in enumerate(self._structured_response.parts): if isinstance(part, TextPart): text = part.content *words, last_word = text.split(' ') words = [f'{word} ' for word in words] words.append(last_word) if len(words) == 1 and len(text) > 2: mid = len(text) // 2 words = [text[:mid], text[mid:]] self._usage += _get_string_usage('') maybe_event = self._parts_manager.handle_text_delta(vendor_part_id=i, content='') if maybe_event is not None: # pragma: no branch yield maybe_event for word in words: self._usage += _get_string_usage(word) maybe_event = self._parts_manager.handle_text_delta(vendor_part_id=i, content=word) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(part, ToolCallPart): yield self._parts_manager.handle_tool_call_part( vendor_part_id=i, tool_name=part.tool_name, args=part.args, tool_call_id=part.tool_call_id ) elif isinstance(part, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # NOTE: These parts are not generated by TestModel, but we need to handle them for type checking assert False, f'Unexpected part type in TestModel: {type(part).__name__}' elif isinstance(part, ThinkingPart): # pragma: no cover # NOTE: There's no way to reach this part of the code, since we don't generate ThinkingPart on TestModel. assert False, "This should be unreachable — we don't generate ThinkingPart on TestModel." else: assert_never(part) @property def model_name(self) -> str: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # pydantic_ai.models.wrapper ### WrapperModel Bases: `Model` Model which wraps another model. Does nothing on its own, used as a base class. Source code in `pydantic_ai_slim/pydantic_ai/models/wrapper.py` ```python @dataclass(init=False) class WrapperModel(Model): """Model which wraps another model. Does nothing on its own, used as a base class. 
""" wrapped: Model """The underlying model being wrapped.""" def __init__(self, wrapped: Model | KnownModelName): super().__init__() self.wrapped = infer_model(wrapped) async def request(self, *args: Any, **kwargs: Any) -> ModelResponse: return await self.wrapped.request(*args, **kwargs) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: async with self.wrapped.request_stream( messages, model_settings, model_request_parameters, run_context ) as response_stream: yield response_stream def customize_request_parameters(self, model_request_parameters: ModelRequestParameters) -> ModelRequestParameters: return self.wrapped.customize_request_parameters(model_request_parameters) @property def model_name(self) -> str: return self.wrapped.model_name @property def system(self) -> str: return self.wrapped.system @cached_property def profile(self) -> ModelProfile: return self.wrapped.profile @property def settings(self) -> ModelSettings | None: """Get the settings from the wrapped model.""" return self.wrapped.settings def __getattr__(self, item: str): return getattr(self.wrapped, item) ``` #### wrapped ```python wrapped: Model = infer_model(wrapped) ``` The underlying model being wrapped. #### settings ```python settings: ModelSettings | None ``` Get the settings from the wrapped model. # `pydantic_evals.dataset` Dataset management for pydantic evals. This module provides functionality for creating, loading, saving, and evaluating datasets of test cases. Each case must have inputs, and can optionally have a name, expected output, metadata, and case-specific evaluators. Datasets can be loaded from and saved to YAML or JSON files, and can be evaluated against a task function to produce an evaluation report. ### Case Bases: `Generic[InputsT, OutputT, MetadataT]` A single row of a Dataset. Each case represents a single test scenario with inputs to test. A case may optionally specify a name, expected outputs to compare against, and arbitrary metadata. Cases can also have their own specific evaluators which are run in addition to dataset-level evaluators. Example: ```python from pydantic_evals import Case case = Case( name='Simple addition', inputs={'a': 1, 'b': 2}, expected_output=3, metadata={'description': 'Tests basic addition'}, ) ``` Source code in `pydantic_evals/pydantic_evals/dataset.py` ````python @dataclass(init=False) class Case(Generic[InputsT, OutputT, MetadataT]): """A single row of a [`Dataset`][pydantic_evals.Dataset]. Each case represents a single test scenario with inputs to test. A case may optionally specify a name, expected outputs to compare against, and arbitrary metadata. Cases can also have their own specific evaluators which are run in addition to dataset-level evaluators. Example: ```python from pydantic_evals import Case case = Case( name='Simple addition', inputs={'a': 1, 'b': 2}, expected_output=3, metadata={'description': 'Tests basic addition'}, ) ``` """ name: str | None """Name of the case. This is used to identify the case in the report and can be used to filter cases.""" inputs: InputsT """Inputs to the task. This is the input to the task that will be evaluated.""" metadata: MetadataT | None = None """Metadata to be used in the evaluation. This can be used to provide additional information about the case to the evaluators. 
""" expected_output: OutputT | None = None """Expected output of the task. This is the expected output of the task that will be evaluated.""" evaluators: list[Evaluator[InputsT, OutputT, MetadataT]] = field(default_factory=list) """Evaluators to be used just on this case.""" def __init__( self, *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ): """Initialize a new test case. Args: name: Optional name for the case. If not provided, a generic name will be assigned when added to a dataset. inputs: The inputs to the task being evaluated. metadata: Optional metadata for the case, which can be used by evaluators. expected_output: Optional expected output of the task, used for comparison in evaluators. evaluators: Tuple of evaluators specific to this case. These are in addition to any dataset-level evaluators. """ # Note: `evaluators` must be a tuple instead of Sequence due to misbehavior with pyright's generic parameter # inference if it has type `Sequence` self.name = name self.inputs = inputs self.metadata = metadata self.expected_output = expected_output self.evaluators = list(evaluators) ```` #### __init__ ```python __init__( *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[ Evaluator[InputsT, OutputT, MetadataT], ... ] = () ) ``` Initialize a new test case. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str | None` | Optional name for the case. If not provided, a generic name will be assigned when added to a dataset. | `None` | | `inputs` | `InputsT` | The inputs to the task being evaluated. | *required* | | `metadata` | `MetadataT | None` | Optional metadata for the case, which can be used by evaluators. | `None` | | `expected_output` | `OutputT | None` | Optional expected output of the task, used for comparison in evaluators. | `None` | | `evaluators` | `tuple[Evaluator[InputsT, OutputT, MetadataT], ...]` | Tuple of evaluators specific to this case. These are in addition to any dataset-level evaluators. | `()` | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def __init__( self, *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ): """Initialize a new test case. Args: name: Optional name for the case. If not provided, a generic name will be assigned when added to a dataset. inputs: The inputs to the task being evaluated. metadata: Optional metadata for the case, which can be used by evaluators. expected_output: Optional expected output of the task, used for comparison in evaluators. evaluators: Tuple of evaluators specific to this case. These are in addition to any dataset-level evaluators. """ # Note: `evaluators` must be a tuple instead of Sequence due to misbehavior with pyright's generic parameter # inference if it has type `Sequence` self.name = name self.inputs = inputs self.metadata = metadata self.expected_output = expected_output self.evaluators = list(evaluators) ``` #### name ```python name: str | None = name ``` Name of the case. This is used to identify the case in the report and can be used to filter cases. #### inputs ```python inputs: InputsT = inputs ``` Inputs to the task. This is the input to the task that will be evaluated. 
#### metadata ```python metadata: MetadataT | None = metadata ``` Metadata to be used in the evaluation. This can be used to provide additional information about the case to the evaluators. #### expected_output ```python expected_output: OutputT | None = expected_output ``` Expected output of the task. This is the expected output of the task that will be evaluated. #### evaluators ```python evaluators: list[Evaluator[InputsT, OutputT, MetadataT]] = ( list(evaluators) ) ``` Evaluators to be used just on this case. ### Dataset Bases: `BaseModel`, `Generic[InputsT, OutputT, MetadataT]` A dataset of test cases. Datasets allow you to organize a collection of test cases and evaluate them against a task function. They can be loaded from and saved to YAML or JSON files, and can have dataset-level evaluators that apply to all cases. Example: ```python # Create a dataset with two test cases from dataclasses import dataclass from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output dataset = Dataset( cases=[ Case(name='test1', inputs={'text': 'Hello'}, expected_output='HELLO'), Case(name='test2', inputs={'text': 'World'}, expected_output='WORLD'), ], evaluators=[ExactMatch()], ) # Evaluate the dataset against a task function async def uppercase(inputs: dict) -> str: return inputs['text'].upper() async def main(): report = await dataset.evaluate(uppercase) report.print() ''' Evaluation Summary: uppercase ┏━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ test1 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ test2 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 10ms │ └──────────┴────────────┴──────────┘ ''' ``` Source code in `pydantic_evals/pydantic_evals/dataset.py` ````python class Dataset(BaseModel, Generic[InputsT, OutputT, MetadataT], extra='forbid', arbitrary_types_allowed=True): """A dataset of test [cases][pydantic_evals.Case]. Datasets allow you to organize a collection of test cases and evaluate them against a task function. They can be loaded from and saved to YAML or JSON files, and can have dataset-level evaluators that apply to all cases. 
Example: ```python # Create a dataset with two test cases from dataclasses import dataclass from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output dataset = Dataset( cases=[ Case(name='test1', inputs={'text': 'Hello'}, expected_output='HELLO'), Case(name='test2', inputs={'text': 'World'}, expected_output='WORLD'), ], evaluators=[ExactMatch()], ) # Evaluate the dataset against a task function async def uppercase(inputs: dict) -> str: return inputs['text'].upper() async def main(): report = await dataset.evaluate(uppercase) report.print() ''' Evaluation Summary: uppercase ┏━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ test1 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ test2 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 10ms │ └──────────┴────────────┴──────────┘ ''' ``` """ cases: list[Case[InputsT, OutputT, MetadataT]] """List of test cases in the dataset.""" evaluators: list[Evaluator[InputsT, OutputT, MetadataT]] = [] """List of evaluators to be used on all cases in the dataset.""" def __init__( self, *, cases: Sequence[Case[InputsT, OutputT, MetadataT]], evaluators: Sequence[Evaluator[InputsT, OutputT, MetadataT]] = (), ): """Initialize a new dataset with test cases and optional evaluators. Args: cases: Sequence of test cases to include in the dataset. evaluators: Optional sequence of evaluators to apply to all cases in the dataset. """ case_names = set[str]() for case in cases: if case.name is None: continue if case.name in case_names: raise ValueError(f'Duplicate case name: {case.name!r}') case_names.add(case.name) super().__init__( cases=cases, evaluators=list(evaluators), ) async def evaluate( self, task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT]: """Evaluates the test cases in the dataset using the given task. This method runs the task on each case in the dataset, applies evaluators, and collects results into a report. Cases are run concurrently, limited by `max_concurrency` if specified. Args: task: The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. name: The name of the task being evaluated, this is used to identify the task in the report. If omitted, the name of the task function will be used. max_concurrency: The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. progress: Whether to show a progress bar for the evaluation. Defaults to `True`. retry_task: Optional retry configuration for the task execution. retry_evaluators: Optional retry configuration for evaluator execution. Returns: A report containing the results of the evaluation. 
""" name = name or get_unwrapped_function_name(task) total_cases = len(self.cases) progress_bar = Progress() if progress else None limiter = anyio.Semaphore(max_concurrency) if max_concurrency is not None else AsyncExitStack() with ( logfire_span('evaluate {name}', name=name, n_cases=len(self.cases)) as eval_span, progress_bar or nullcontext(), ): task_id = progress_bar.add_task(f'Evaluating {name}', total=total_cases) if progress_bar else None async def _handle_case(case: Case[InputsT, OutputT, MetadataT], report_case_name: str): async with limiter: result = await _run_task_and_evaluators( task, case, report_case_name, self.evaluators, retry_task, retry_evaluators ) if progress_bar and task_id is not None: # pragma: no branch progress_bar.update(task_id, advance=1) return result if (context := eval_span.context) is None: # pragma: no cover trace_id = None span_id = None else: trace_id = f'{context.trace_id:032x}' span_id = f'{context.span_id:016x}' cases_and_failures = await task_group_gather( [ lambda case=case, i=i: _handle_case(case, case.name or f'Case {i}') for i, case in enumerate(self.cases, 1) ] ) cases: list[ReportCase] = [] failures: list[ReportCaseFailure] = [] for item in cases_and_failures: if isinstance(item, ReportCase): cases.append(item) else: failures.append(item) report = EvaluationReport( name=name, cases=cases, failures=failures, span_id=span_id, trace_id=trace_id, ) if (averages := report.averages()) is not None and averages.assertions is not None: eval_span.set_attribute('assertion_pass_rate', averages.assertions) return report def evaluate_sync( self, task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, ) -> EvaluationReport[InputsT, OutputT, MetadataT]: """Evaluates the test cases in the dataset using the given task. This is a synchronous wrapper around [`evaluate`][pydantic_evals.Dataset.evaluate] provided for convenience. Args: task: The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. name: The name of the task being evaluated, this is used to identify the task in the report. If omitted, the name of the task function will be used. max_concurrency: The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. progress: Whether to show a progress bar for the evaluation. Defaults to True. Returns: A report containing the results of the evaluation. """ return get_event_loop().run_until_complete( self.evaluate(task, name=name, max_concurrency=max_concurrency, progress=progress) ) def add_case( self, *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ) -> None: """Adds a case to the dataset. This is a convenience method for creating a [`Case`][pydantic_evals.Case] and adding it to the dataset. Args: name: Optional name for the case. If not provided, a generic name will be assigned. inputs: The inputs to the task being evaluated. metadata: Optional metadata for the case, which can be used by evaluators. expected_output: The expected output of the task, used for comparison in evaluators. evaluators: Tuple of evaluators specific to this case, in addition to dataset-level evaluators. 
""" if name in {case.name for case in self.cases}: raise ValueError(f'Duplicate case name: {name!r}') case = Case[InputsT, OutputT, MetadataT]( name=name, inputs=inputs, metadata=metadata, expected_output=expected_output, evaluators=evaluators, ) self.cases.append(case) def add_evaluator( self, evaluator: Evaluator[InputsT, OutputT, MetadataT], specific_case: str | None = None, ) -> None: """Adds an evaluator to the dataset or a specific case. Args: evaluator: The evaluator to add. specific_case: If provided, the evaluator will only be added to the case with this name. If None, the evaluator will be added to all cases in the dataset. Raises: ValueError: If `specific_case` is provided but no case with that name exists in the dataset. """ if specific_case is None: self.evaluators.append(evaluator) else: # If this is too slow, we could try to add a case lookup dict. # Note that if we do that, we'd need to make the cases list private to prevent modification. added = False for case in self.cases: if case.name == specific_case: case.evaluators.append(evaluator) added = True if not added: raise ValueError(f'Case {specific_case!r} not found in the dataset') @classmethod @functools.cache def _params(cls) -> tuple[type[InputsT], type[OutputT], type[MetadataT]]: """Get the type parameters for the Dataset class. Returns: A tuple of (InputsT, OutputT, MetadataT) types. """ for c in cls.__mro__: metadata = getattr(c, '__pydantic_generic_metadata__', {}) if len(args := (metadata.get('args', ()) or getattr(c, '__args__', ()))) == 3: # pragma: no branch return args else: # pragma: no cover warnings.warn( f'Could not determine the generic parameters for {cls}; using `Any` for each.' f' You should explicitly set the generic parameters via `Dataset[MyInputs, MyOutput, MyMetadata]`' f' when serializing or deserializing.', UserWarning, ) return Any, Any, Any # type: ignore @classmethod def from_file( cls, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> Self: """Load a dataset from a file. Args: path: Path to the file to load. fmt: Format of the file. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. Returns: A new Dataset instance loaded from the file. Raises: ValidationError: If the file cannot be parsed as a valid dataset. ValueError: If the format cannot be inferred from the file extension. """ path = Path(path) fmt = cls._infer_fmt(path, fmt) raw = Path(path).read_text() try: return cls.from_text(raw, fmt=fmt, custom_evaluator_types=custom_evaluator_types) except ValidationError as e: # pragma: no cover raise ValueError(f'{path} contains data that does not match the schema for {cls.__name__}:\n{e}.') from e @classmethod def from_text( cls, contents: str, fmt: Literal['yaml', 'json'] = 'yaml', custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> Self: """Load a dataset from a string. Args: contents: The string content to parse. fmt: Format of the content. Must be either 'yaml' or 'json'. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. Returns: A new Dataset instance parsed from the string. Raises: ValidationError: If the content cannot be parsed as a valid dataset. 
""" if fmt == 'yaml': loaded = yaml.safe_load(contents) return cls.from_dict(loaded, custom_evaluator_types) else: dataset_model_type = cls._serialization_type() dataset_model = dataset_model_type.model_validate_json(contents) return cls._from_dataset_model(dataset_model, custom_evaluator_types) @classmethod def from_dict( cls, data: dict[str, Any], custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> Self: """Load a dataset from a dictionary. Args: data: Dictionary representation of the dataset. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. Returns: A new Dataset instance created from the dictionary. Raises: ValidationError: If the dictionary cannot be converted to a valid dataset. """ dataset_model_type = cls._serialization_type() dataset_model = dataset_model_type.model_validate(data) return cls._from_dataset_model(dataset_model, custom_evaluator_types) @classmethod def _from_dataset_model( cls, dataset_model: _DatasetModel[InputsT, OutputT, MetadataT], custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> Self: """Create a Dataset from a _DatasetModel. Args: dataset_model: The _DatasetModel to convert. custom_evaluator_types: Custom evaluator classes to register for deserialization. Returns: A new Dataset instance created from the _DatasetModel. """ registry = _get_registry(custom_evaluator_types) cases: list[Case[InputsT, OutputT, MetadataT]] = [] errors: list[ValueError] = [] dataset_evaluators: list[Evaluator] = [] for spec in dataset_model.evaluators: try: dataset_evaluator = _load_evaluator_from_registry(registry, None, spec) except ValueError as e: errors.append(e) continue dataset_evaluators.append(dataset_evaluator) for row in dataset_model.cases: evaluators: list[Evaluator] = [] for spec in row.evaluators: try: evaluator = _load_evaluator_from_registry(registry, row.name, spec) except ValueError as e: errors.append(e) continue evaluators.append(evaluator) row = Case[InputsT, OutputT, MetadataT]( name=row.name, inputs=row.inputs, metadata=row.metadata, expected_output=row.expected_output, ) row.evaluators = evaluators cases.append(row) if errors: raise ExceptionGroup(f'{len(errors)} error(s) loading evaluators from registry', errors[:3]) result = cls(cases=cases) result.evaluators = dataset_evaluators return result def to_file( self, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, schema_path: Path | str | None = DEFAULT_SCHEMA_PATH_TEMPLATE, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ): """Save the dataset to a file. Args: path: Path to save the dataset to. fmt: Format to use. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. schema_path: Path to save the JSON schema to. If None, no schema will be saved. Can be a string template with {stem} which will be replaced with the dataset filename stem. custom_evaluator_types: Custom evaluator classes to include in the schema. 
""" path = Path(path) fmt = self._infer_fmt(path, fmt) schema_ref: str | None = None if schema_path is not None: # pragma: no branch if isinstance(schema_path, str): # pragma: no branch schema_path = Path(schema_path.format(stem=path.stem)) if not schema_path.is_absolute(): schema_ref = str(schema_path) schema_path = path.parent / schema_path elif schema_path.is_relative_to(path): # pragma: no cover schema_ref = str(_get_relative_path_reference(schema_path, path)) else: # pragma: no cover schema_ref = str(schema_path) self._save_schema(schema_path, custom_evaluator_types) context: dict[str, Any] = {'use_short_form': True} if fmt == 'yaml': dumped_data = self.model_dump(mode='json', by_alias=True, exclude_defaults=True, context=context) content = yaml.dump(dumped_data, sort_keys=False) if schema_ref: # pragma: no branch yaml_language_server_line = f'{_YAML_SCHEMA_LINE_PREFIX}{schema_ref}' content = f'{yaml_language_server_line}\n{content}' path.write_text(content) else: context['$schema'] = schema_ref json_data = self.model_dump_json(indent=2, by_alias=True, exclude_defaults=True, context=context) path.write_text(json_data + '\n') @classmethod def model_json_schema_with_evaluators( cls, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> dict[str, Any]: """Generate a JSON schema for this dataset type, including evaluator details. This is useful for generating a schema that can be used to validate YAML-format dataset files. Args: custom_evaluator_types: Custom evaluator classes to include in the schema. Returns: A dictionary representing the JSON schema. """ # Note: this function could maybe be simplified now that Evaluators are always dataclasses registry = _get_registry(custom_evaluator_types) evaluator_schema_types: list[Any] = [] for name, evaluator_class in registry.items(): type_hints = _typing_extra.get_function_type_hints(evaluator_class) type_hints.pop('return', None) required_type_hints: dict[str, Any] = {} for p in inspect.signature(evaluator_class).parameters.values(): type_hints.setdefault(p.name, Any) if p.default is not p.empty: type_hints[p.name] = NotRequired[type_hints[p.name]] else: required_type_hints[p.name] = type_hints[p.name] def _make_typed_dict(cls_name_prefix: str, fields: dict[str, Any]) -> Any: td = TypedDict(f'{cls_name_prefix}_{name}', fields) # pyright: ignore[reportArgumentType] config = ConfigDict(extra='forbid', arbitrary_types_allowed=True) # TODO: Replace with pydantic.with_config once pydantic 2.11 is the min supported version td.__pydantic_config__ = config # pyright: ignore[reportAttributeAccessIssue] return td # Shortest form: just the call name if len(type_hints) == 0 or not required_type_hints: evaluator_schema_types.append(Literal[name]) # Short form: can be called with only one parameter if len(type_hints) == 1: [type_hint_type] = type_hints.values() evaluator_schema_types.append(_make_typed_dict('short_evaluator', {name: type_hint_type})) elif len(required_type_hints) == 1: # pragma: no branch [type_hint_type] = required_type_hints.values() evaluator_schema_types.append(_make_typed_dict('short_evaluator', {name: type_hint_type})) # Long form: multiple parameters, possibly required if len(type_hints) > 1: params_td = _make_typed_dict('evaluator_params', type_hints) evaluator_schema_types.append(_make_typed_dict('evaluator', {name: params_td})) in_type, out_type, meta_type = cls._params() # Note: we shadow the `Case` and `Dataset` class names here to generate a clean JSON schema class Case(BaseModel, 
extra='forbid'): # pyright: ignore[reportUnusedClass] # this _is_ used below, but pyright doesn't seem to notice.. name: str | None = None inputs: in_type # pyright: ignore[reportInvalidTypeForm] metadata: meta_type | None = None # pyright: ignore[reportInvalidTypeForm,reportUnknownVariableType] expected_output: out_type | None = None # pyright: ignore[reportInvalidTypeForm,reportUnknownVariableType] if evaluator_schema_types: # pragma: no branch evaluators: list[Union[tuple(evaluator_schema_types)]] = [] # pyright: ignore # noqa UP007 class Dataset(BaseModel, extra='forbid'): cases: list[Case] if evaluator_schema_types: # pragma: no branch evaluators: list[Union[tuple(evaluator_schema_types)]] = [] # pyright: ignore # noqa UP007 json_schema = Dataset.model_json_schema() # See `_add_json_schema` below, since `$schema` is added to the JSON, it has to be supported in the JSON json_schema['properties']['$schema'] = {'type': 'string'} return json_schema @classmethod def _save_schema( cls, path: Path | str, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = () ): """Save the JSON schema for this dataset type to a file. Args: path: Path to save the schema to. custom_evaluator_types: Custom evaluator classes to include in the schema. """ path = Path(path) json_schema = cls.model_json_schema_with_evaluators(custom_evaluator_types) schema_content = to_json(json_schema, indent=2).decode() + '\n' if not path.exists() or path.read_text() != schema_content: # pragma: no branch path.write_text(schema_content) @classmethod @functools.cache def _serialization_type(cls) -> type[_DatasetModel[InputsT, OutputT, MetadataT]]: """Get the serialization type for this dataset class. Returns: A _DatasetModel type with the same generic parameters as this Dataset class. """ input_type, output_type, metadata_type = cls._params() return _DatasetModel[input_type, output_type, metadata_type] @classmethod def _infer_fmt(cls, path: Path, fmt: Literal['yaml', 'json'] | None) -> Literal['yaml', 'json']: """Infer the format to use for a file based on its extension. Args: path: The path to infer the format for. fmt: The explicitly provided format, if any. Returns: The inferred format ('yaml' or 'json'). Raises: ValueError: If the format cannot be inferred from the file extension. """ if fmt is not None: return fmt suffix = path.suffix.lower() if suffix in {'.yaml', '.yml'}: return 'yaml' elif suffix == '.json': return 'json' raise ValueError( f'Could not infer format for filename {path.name!r}. Use the `fmt` argument to specify the format.' ) @model_serializer(mode='wrap') def _add_json_schema(self, nxt: SerializerFunctionWrapHandler, info: SerializationInfo) -> dict[str, Any]: """Add the JSON schema path to the serialized output. See for context, that seems to be the nearest there is to a spec for this. """ context = cast(dict[str, Any] | None, info.context) if isinstance(context, dict) and (schema := context.get('$schema')): return {'$schema': schema} | nxt(self) else: return nxt(self) ```` #### cases ```python cases: list[Case[InputsT, OutputT, MetadataT]] ``` List of test cases in the dataset. #### evaluators ```python evaluators: list[Evaluator[InputsT, OutputT, MetadataT]] = ( [] ) ``` List of evaluators to be used on all cases in the dataset. #### __init__ ```python __init__( *, cases: Sequence[Case[InputsT, OutputT, MetadataT]], evaluators: Sequence[ Evaluator[InputsT, OutputT, MetadataT] ] = () ) ``` Initialize a new dataset with test cases and optional evaluators. 
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `cases` | `Sequence[Case[InputsT, OutputT, MetadataT]]` | Sequence of test cases to include in the dataset. | *required* | | `evaluators` | `Sequence[Evaluator[InputsT, OutputT, MetadataT]]` | Optional sequence of evaluators to apply to all cases in the dataset. | `()` | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def __init__( self, *, cases: Sequence[Case[InputsT, OutputT, MetadataT]], evaluators: Sequence[Evaluator[InputsT, OutputT, MetadataT]] = (), ): """Initialize a new dataset with test cases and optional evaluators. Args: cases: Sequence of test cases to include in the dataset. evaluators: Optional sequence of evaluators to apply to all cases in the dataset. """ case_names = set[str]() for case in cases: if case.name is None: continue if case.name in case_names: raise ValueError(f'Duplicate case name: {case.name!r}') case_names.add(case.name) super().__init__( cases=cases, evaluators=list(evaluators), ) ``` #### evaluate ```python evaluate( task: ( Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT] ), name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT] ``` Evaluates the test cases in the dataset using the given task. This method runs the task on each case in the dataset, applies evaluators, and collects results into a report. Cases are run concurrently, limited by `max_concurrency` if specified. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `task` | `Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT]` | The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. | *required* | | `name` | `str | None` | The name of the task being evaluated, this is used to identify the task in the report. If omitted, the name of the task function will be used. | `None` | | `max_concurrency` | `int | None` | The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. | `None` | | `progress` | `bool` | Whether to show a progress bar for the evaluation. Defaults to True. | `True` | | `retry_task` | `RetryConfig | None` | Optional retry configuration for the task execution. | `None` | | `retry_evaluators` | `RetryConfig | None` | Optional retry configuration for evaluator execution. | `None` | Returns: | Type | Description | | --- | --- | | `EvaluationReport[InputsT, OutputT, MetadataT]` | A report containing the results of the evaluation. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python async def evaluate( self, task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT]: """Evaluates the test cases in the dataset using the given task. This method runs the task on each case in the dataset, applies evaluators, and collects results into a report. Cases are run concurrently, limited by `max_concurrency` if specified. Args: task: The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. 
name: The name of the task being evaluated, this is used to identify the task in the report. If omitted, the name of the task function will be used. max_concurrency: The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. progress: Whether to show a progress bar for the evaluation. Defaults to `True`. retry_task: Optional retry configuration for the task execution. retry_evaluators: Optional retry configuration for evaluator execution. Returns: A report containing the results of the evaluation. """ name = name or get_unwrapped_function_name(task) total_cases = len(self.cases) progress_bar = Progress() if progress else None limiter = anyio.Semaphore(max_concurrency) if max_concurrency is not None else AsyncExitStack() with ( logfire_span('evaluate {name}', name=name, n_cases=len(self.cases)) as eval_span, progress_bar or nullcontext(), ): task_id = progress_bar.add_task(f'Evaluating {name}', total=total_cases) if progress_bar else None async def _handle_case(case: Case[InputsT, OutputT, MetadataT], report_case_name: str): async with limiter: result = await _run_task_and_evaluators( task, case, report_case_name, self.evaluators, retry_task, retry_evaluators ) if progress_bar and task_id is not None: # pragma: no branch progress_bar.update(task_id, advance=1) return result if (context := eval_span.context) is None: # pragma: no cover trace_id = None span_id = None else: trace_id = f'{context.trace_id:032x}' span_id = f'{context.span_id:016x}' cases_and_failures = await task_group_gather( [ lambda case=case, i=i: _handle_case(case, case.name or f'Case {i}') for i, case in enumerate(self.cases, 1) ] ) cases: list[ReportCase] = [] failures: list[ReportCaseFailure] = [] for item in cases_and_failures: if isinstance(item, ReportCase): cases.append(item) else: failures.append(item) report = EvaluationReport( name=name, cases=cases, failures=failures, span_id=span_id, trace_id=trace_id, ) if (averages := report.averages()) is not None and averages.assertions is not None: eval_span.set_attribute('assertion_pass_rate', averages.assertions) return report ``` #### evaluate_sync ```python evaluate_sync( task: ( Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT] ), name: str | None = None, max_concurrency: int | None = None, progress: bool = True, ) -> EvaluationReport[InputsT, OutputT, MetadataT] ``` Evaluates the test cases in the dataset using the given task. This is a synchronous wrapper around evaluate provided for convenience. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `task` | `Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT]` | The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. | *required* | | `name` | `str | None` | The name of the task being evaluated, this is used to identify the task in the report. If omitted, the name of the task function will be used. | `None` | | `max_concurrency` | `int | None` | The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. | `None` | | `progress` | `bool` | Whether to show a progress bar for the evaluation. Defaults to True. | `True` | Returns: | Type | Description | | --- | --- | | `EvaluationReport[InputsT, OutputT, MetadataT]` | A report containing the results of the evaluation. 
| Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def evaluate_sync( self, task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, ) -> EvaluationReport[InputsT, OutputT, MetadataT]: """Evaluates the test cases in the dataset using the given task. This is a synchronous wrapper around [`evaluate`][pydantic_evals.Dataset.evaluate] provided for convenience. Args: task: The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. name: The name of the task being evaluated, this is used to identify the task in the report. If omitted, the name of the task function will be used. max_concurrency: The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. progress: Whether to show a progress bar for the evaluation. Defaults to True. Returns: A report containing the results of the evaluation. """ return get_event_loop().run_until_complete( self.evaluate(task, name=name, max_concurrency=max_concurrency, progress=progress) ) ``` #### add_case ```python add_case( *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[ Evaluator[InputsT, OutputT, MetadataT], ... ] = () ) -> None ``` Adds a case to the dataset. This is a convenience method for creating a Case and adding it to the dataset. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str | None` | Optional name for the case. If not provided, a generic name will be assigned. | `None` | | `inputs` | `InputsT` | The inputs to the task being evaluated. | *required* | | `metadata` | `MetadataT | None` | Optional metadata for the case, which can be used by evaluators. | `None` | | `expected_output` | `OutputT | None` | The expected output of the task, used for comparison in evaluators. | `None` | | `evaluators` | `tuple[Evaluator[InputsT, OutputT, MetadataT], ...]` | Tuple of evaluators specific to this case, in addition to dataset-level evaluators. | `()` | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def add_case( self, *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ) -> None: """Adds a case to the dataset. This is a convenience method for creating a [`Case`][pydantic_evals.Case] and adding it to the dataset. Args: name: Optional name for the case. If not provided, a generic name will be assigned. inputs: The inputs to the task being evaluated. metadata: Optional metadata for the case, which can be used by evaluators. expected_output: The expected output of the task, used for comparison in evaluators. evaluators: Tuple of evaluators specific to this case, in addition to dataset-level evaluators. """ if name in {case.name for case in self.cases}: raise ValueError(f'Duplicate case name: {name!r}') case = Case[InputsT, OutputT, MetadataT]( name=name, inputs=inputs, metadata=metadata, expected_output=expected_output, evaluators=evaluators, ) self.cases.append(case) ``` #### add_evaluator ```python add_evaluator( evaluator: Evaluator[InputsT, OutputT, MetadataT], specific_case: str | None = None, ) -> None ``` Adds an evaluator to the dataset or a specific case. 
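As a hedged sketch of both modes, the snippet below mirrors the `ExactMatch` evaluator and cases from the `Dataset` example above; the case names are illustrative only:

```python
from dataclasses import dataclass

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class ExactMatch(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return ctx.output == ctx.expected_output


dataset = Dataset(
    cases=[
        Case(name='test1', inputs={'text': 'Hello'}, expected_output='HELLO'),
        Case(name='test2', inputs={'text': 'World'}, expected_output='WORLD'),
    ],
)

# With no specific_case, the evaluator applies to every case in the dataset.
dataset.add_evaluator(ExactMatch())
# With specific_case, it applies only to that case; a ValueError is raised if
# no case with that name exists.
dataset.add_evaluator(ExactMatch(), specific_case='test2')
```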
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `evaluator` | `Evaluator[InputsT, OutputT, MetadataT]` | The evaluator to add. | *required* | | `specific_case` | `str | None` | If provided, the evaluator will only be added to the case with this name. If None, the evaluator will be added to all cases in the dataset. | `None` | Raises: | Type | Description | | --- | --- | | `ValueError` | If specific_case is provided but no case with that name exists in the dataset. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def add_evaluator( self, evaluator: Evaluator[InputsT, OutputT, MetadataT], specific_case: str | None = None, ) -> None: """Adds an evaluator to the dataset or a specific case. Args: evaluator: The evaluator to add. specific_case: If provided, the evaluator will only be added to the case with this name. If None, the evaluator will be added to all cases in the dataset. Raises: ValueError: If `specific_case` is provided but no case with that name exists in the dataset. """ if specific_case is None: self.evaluators.append(evaluator) else: # If this is too slow, we could try to add a case lookup dict. # Note that if we do that, we'd need to make the cases list private to prevent modification. added = False for case in self.cases: if case.name == specific_case: case.evaluators.append(evaluator) added = True if not added: raise ValueError(f'Case {specific_case!r} not found in the dataset') ``` #### from_file ```python from_file( path: Path | str, fmt: Literal["yaml", "json"] | None = None, custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), ) -> Self ``` Load a dataset from a file. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `path` | `Path | str` | Path to the file to load. | *required* | | `fmt` | `Literal['yaml', 'json'] | None` | Format of the file. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. | `None` | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. | `()` | Returns: | Type | Description | | --- | --- | | `Self` | A new Dataset instance loaded from the file. | Raises: | Type | Description | | --- | --- | | `ValidationError` | If the file cannot be parsed as a valid dataset. | | `ValueError` | If the format cannot be inferred from the file extension. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python @classmethod def from_file( cls, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> Self: """Load a dataset from a file. Args: path: Path to the file to load. fmt: Format of the file. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. Returns: A new Dataset instance loaded from the file. Raises: ValidationError: If the file cannot be parsed as a valid dataset. ValueError: If the format cannot be inferred from the file extension. 
""" path = Path(path) fmt = cls._infer_fmt(path, fmt) raw = Path(path).read_text() try: return cls.from_text(raw, fmt=fmt, custom_evaluator_types=custom_evaluator_types) except ValidationError as e: # pragma: no cover raise ValueError(f'{path} contains data that does not match the schema for {cls.__name__}:\n{e}.') from e ``` #### from_text ```python from_text( contents: str, fmt: Literal["yaml", "json"] = "yaml", custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), ) -> Self ``` Load a dataset from a string. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `contents` | `str` | The string content to parse. | *required* | | `fmt` | `Literal['yaml', 'json']` | Format of the content. Must be either 'yaml' or 'json'. | `'yaml'` | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. | `()` | Returns: | Type | Description | | --- | --- | | `Self` | A new Dataset instance parsed from the string. | Raises: | Type | Description | | --- | --- | | `ValidationError` | If the content cannot be parsed as a valid dataset. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python @classmethod def from_text( cls, contents: str, fmt: Literal['yaml', 'json'] = 'yaml', custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> Self: """Load a dataset from a string. Args: contents: The string content to parse. fmt: Format of the content. Must be either 'yaml' or 'json'. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. Returns: A new Dataset instance parsed from the string. Raises: ValidationError: If the content cannot be parsed as a valid dataset. """ if fmt == 'yaml': loaded = yaml.safe_load(contents) return cls.from_dict(loaded, custom_evaluator_types) else: dataset_model_type = cls._serialization_type() dataset_model = dataset_model_type.model_validate_json(contents) return cls._from_dataset_model(dataset_model, custom_evaluator_types) ``` #### from_dict ```python from_dict( data: dict[str, Any], custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), ) -> Self ``` Load a dataset from a dictionary. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `data` | `dict[str, Any]` | Dictionary representation of the dataset. | *required* | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. | `()` | Returns: | Type | Description | | --- | --- | | `Self` | A new Dataset instance created from the dictionary. | Raises: | Type | Description | | --- | --- | | `ValidationError` | If the dictionary cannot be converted to a valid dataset. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python @classmethod def from_dict( cls, data: dict[str, Any], custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> Self: """Load a dataset from a dictionary. Args: data: Dictionary representation of the dataset. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. Returns: A new Dataset instance created from the dictionary. 
Raises: ValidationError: If the dictionary cannot be converted to a valid dataset. """ dataset_model_type = cls._serialization_type() dataset_model = dataset_model_type.model_validate(data) return cls._from_dataset_model(dataset_model, custom_evaluator_types) ``` #### to_file ```python to_file( path: Path | str, fmt: Literal["yaml", "json"] | None = None, schema_path: ( Path | str | None ) = DEFAULT_SCHEMA_PATH_TEMPLATE, custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), ) ``` Save the dataset to a file. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `path` | `Path | str` | Path to save the dataset to. | *required* | | `fmt` | `Literal['yaml', 'json'] | None` | Format to use. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. | `None` | | `schema_path` | `Path | str | None` | Path to save the JSON schema to. If None, no schema will be saved. Can be a string template with {stem} which will be replaced with the dataset filename stem. | `DEFAULT_SCHEMA_PATH_TEMPLATE` | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to include in the schema. | `()` | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def to_file( self, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, schema_path: Path | str | None = DEFAULT_SCHEMA_PATH_TEMPLATE, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ): """Save the dataset to a file. Args: path: Path to save the dataset to. fmt: Format to use. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. schema_path: Path to save the JSON schema to. If None, no schema will be saved. Can be a string template with {stem} which will be replaced with the dataset filename stem. custom_evaluator_types: Custom evaluator classes to include in the schema. """ path = Path(path) fmt = self._infer_fmt(path, fmt) schema_ref: str | None = None if schema_path is not None: # pragma: no branch if isinstance(schema_path, str): # pragma: no branch schema_path = Path(schema_path.format(stem=path.stem)) if not schema_path.is_absolute(): schema_ref = str(schema_path) schema_path = path.parent / schema_path elif schema_path.is_relative_to(path): # pragma: no cover schema_ref = str(_get_relative_path_reference(schema_path, path)) else: # pragma: no cover schema_ref = str(schema_path) self._save_schema(schema_path, custom_evaluator_types) context: dict[str, Any] = {'use_short_form': True} if fmt == 'yaml': dumped_data = self.model_dump(mode='json', by_alias=True, exclude_defaults=True, context=context) content = yaml.dump(dumped_data, sort_keys=False) if schema_ref: # pragma: no branch yaml_language_server_line = f'{_YAML_SCHEMA_LINE_PREFIX}{schema_ref}' content = f'{yaml_language_server_line}\n{content}' path.write_text(content) else: context['$schema'] = schema_ref json_data = self.model_dump_json(indent=2, by_alias=True, exclude_defaults=True, context=context) path.write_text(json_data + '\n') ``` #### model_json_schema_with_evaluators ```python model_json_schema_with_evaluators( custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), ) -> dict[str, Any] ``` Generate a JSON schema for this dataset type, including evaluator details. This is useful for generating a schema that can be used to validate YAML-format dataset files. 
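Putting `to_file` and `model_json_schema_with_evaluators` together, here is a minimal sketch (the file name and case contents are illustrative); the parameters are listed below:

```python
import json

from pydantic_evals import Case, Dataset

dataset = Dataset(cases=[Case(name='capital', inputs='France', expected_output='Paris')])

# The format is inferred from the extension; by default a companion JSON schema
# file is also written next to the dataset (see schema_path on to_file above).
dataset.to_file('capitals.yaml')

# The same schema can be generated directly, e.g. to wire up editor validation.
schema = Dataset.model_json_schema_with_evaluators()
print(json.dumps(schema)[:60])
```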
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to include in the schema. | `()` | Returns: | Type | Description | | --- | --- | | `dict[str, Any]` | A dictionary representing the JSON schema. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python @classmethod def model_json_schema_with_evaluators( cls, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> dict[str, Any]: """Generate a JSON schema for this dataset type, including evaluator details. This is useful for generating a schema that can be used to validate YAML-format dataset files. Args: custom_evaluator_types: Custom evaluator classes to include in the schema. Returns: A dictionary representing the JSON schema. """ # Note: this function could maybe be simplified now that Evaluators are always dataclasses registry = _get_registry(custom_evaluator_types) evaluator_schema_types: list[Any] = [] for name, evaluator_class in registry.items(): type_hints = _typing_extra.get_function_type_hints(evaluator_class) type_hints.pop('return', None) required_type_hints: dict[str, Any] = {} for p in inspect.signature(evaluator_class).parameters.values(): type_hints.setdefault(p.name, Any) if p.default is not p.empty: type_hints[p.name] = NotRequired[type_hints[p.name]] else: required_type_hints[p.name] = type_hints[p.name] def _make_typed_dict(cls_name_prefix: str, fields: dict[str, Any]) -> Any: td = TypedDict(f'{cls_name_prefix}_{name}', fields) # pyright: ignore[reportArgumentType] config = ConfigDict(extra='forbid', arbitrary_types_allowed=True) # TODO: Replace with pydantic.with_config once pydantic 2.11 is the min supported version td.__pydantic_config__ = config # pyright: ignore[reportAttributeAccessIssue] return td # Shortest form: just the call name if len(type_hints) == 0 or not required_type_hints: evaluator_schema_types.append(Literal[name]) # Short form: can be called with only one parameter if len(type_hints) == 1: [type_hint_type] = type_hints.values() evaluator_schema_types.append(_make_typed_dict('short_evaluator', {name: type_hint_type})) elif len(required_type_hints) == 1: # pragma: no branch [type_hint_type] = required_type_hints.values() evaluator_schema_types.append(_make_typed_dict('short_evaluator', {name: type_hint_type})) # Long form: multiple parameters, possibly required if len(type_hints) > 1: params_td = _make_typed_dict('evaluator_params', type_hints) evaluator_schema_types.append(_make_typed_dict('evaluator', {name: params_td})) in_type, out_type, meta_type = cls._params() # Note: we shadow the `Case` and `Dataset` class names here to generate a clean JSON schema class Case(BaseModel, extra='forbid'): # pyright: ignore[reportUnusedClass] # this _is_ used below, but pyright doesn't seem to notice.. 
name: str | None = None inputs: in_type # pyright: ignore[reportInvalidTypeForm] metadata: meta_type | None = None # pyright: ignore[reportInvalidTypeForm,reportUnknownVariableType] expected_output: out_type | None = None # pyright: ignore[reportInvalidTypeForm,reportUnknownVariableType] if evaluator_schema_types: # pragma: no branch evaluators: list[Union[tuple(evaluator_schema_types)]] = [] # pyright: ignore # noqa UP007 class Dataset(BaseModel, extra='forbid'): cases: list[Case] if evaluator_schema_types: # pragma: no branch evaluators: list[Union[tuple(evaluator_schema_types)]] = [] # pyright: ignore # noqa UP007 json_schema = Dataset.model_json_schema() # See `_add_json_schema` below, since `$schema` is added to the JSON, it has to be supported in the JSON json_schema['properties']['$schema'] = {'type': 'string'} return json_schema ``` ### set_eval_attribute ```python set_eval_attribute(name: str, value: Any) -> None ``` Set an attribute on the current task run. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str` | The name of the attribute. | *required* | | `value` | `Any` | The value of the attribute. | *required* | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def set_eval_attribute(name: str, value: Any) -> None: """Set an attribute on the current task run. Args: name: The name of the attribute. value: The value of the attribute. """ current_case = _CURRENT_TASK_RUN.get() if current_case is not None: # pragma: no branch current_case.record_attribute(name, value) ``` ### increment_eval_metric ```python increment_eval_metric( name: str, amount: int | float ) -> None ``` Increment a metric on the current task run. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str` | The name of the metric. | *required* | | `amount` | `int | float` | The amount to increment by. | *required* | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def increment_eval_metric(name: str, amount: int | float) -> None: """Increment a metric on the current task run. Args: name: The name of the metric. amount: The amount to increment by. """ current_case = _CURRENT_TASK_RUN.get() if current_case is not None: # pragma: no branch current_case.increment_metric(name, amount) ``` # `pydantic_evals.evaluators` ### Contains Bases: `Evaluator[object, object, object]` Check if the output contains the expected output. For strings, checks if expected_output is a substring of output. For lists/tuples, checks if expected_output is in output. For dicts, checks if all key-value pairs in expected_output are in output. Note: case_sensitive only applies when both the value and output are strings. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class Contains(Evaluator[object, object, object]): """Check if the output contains the expected output. For strings, checks if expected_output is a substring of output. For lists/tuples, checks if expected_output is in output. For dicts, checks if all key-value pairs in expected_output are in output. Note: case_sensitive only applies when both the value and output are strings. 
""" value: Any case_sensitive: bool = True as_strings: bool = False evaluation_name: str | None = field(default=None) def evaluate( self, ctx: EvaluatorContext[object, object, object], ) -> EvaluationReason: # Convert objects to strings if requested failure_reason: str | None = None as_strings = self.as_strings or (isinstance(self.value, str) and isinstance(ctx.output, str)) if as_strings: output_str = str(ctx.output) expected_str = str(self.value) if not self.case_sensitive: output_str = output_str.lower() expected_str = expected_str.lower() failure_reason: str | None = None if expected_str not in output_str: output_trunc = _truncated_repr(output_str, max_length=100) expected_trunc = _truncated_repr(expected_str, max_length=100) failure_reason = f'Output string {output_trunc} does not contain expected string {expected_trunc}' return EvaluationReason(value=failure_reason is None, reason=failure_reason) try: # Handle different collection types if isinstance(ctx.output, dict): if isinstance(self.value, dict): # Cast to Any to avoid type checking issues output_dict = cast(dict[Any, Any], ctx.output) # pyright: ignore[reportUnknownMemberType] expected_dict = cast(dict[Any, Any], self.value) # pyright: ignore[reportUnknownMemberType] for k in expected_dict: if k not in output_dict: k_trunc = _truncated_repr(k, max_length=30) failure_reason = f'Output dictionary does not contain expected key {k_trunc}' break elif output_dict[k] != expected_dict[k]: k_trunc = _truncated_repr(k, max_length=30) output_v_trunc = _truncated_repr(output_dict[k], max_length=100) expected_v_trunc = _truncated_repr(expected_dict[k], max_length=100) failure_reason = f'Output dictionary has different value for key {k_trunc}: {output_v_trunc} != {expected_v_trunc}' break else: if self.value not in ctx.output: # pyright: ignore[reportUnknownMemberType] output_trunc = _truncated_repr(ctx.output, max_length=200) # pyright: ignore[reportUnknownMemberType] failure_reason = f'Output {output_trunc} does not contain provided value as a key' elif self.value not in ctx.output: # pyright: ignore[reportOperatorIssue] # will be handled by except block output_trunc = _truncated_repr(ctx.output, max_length=200) failure_reason = f'Output {output_trunc} does not contain provided value' except (TypeError, ValueError) as e: failure_reason = f'Containment check failed: {e}' return EvaluationReason(value=failure_reason is None, reason=failure_reason) ``` ### Equals Bases: `Evaluator[object, object, object]` Check if the output exactly equals the provided value. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class Equals(Evaluator[object, object, object]): """Check if the output exactly equals the provided value.""" value: Any evaluation_name: str | None = field(default=None) def evaluate(self, ctx: EvaluatorContext[object, object, object]) -> bool: return ctx.output == self.value ``` ### EqualsExpected Bases: `Evaluator[object, object, object]` Check if the output exactly equals the expected output. 
Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class EqualsExpected(Evaluator[object, object, object]): """Check if the output exactly equals the expected output.""" evaluation_name: str | None = field(default=None) def evaluate(self, ctx: EvaluatorContext[object, object, object]) -> bool | dict[str, bool]: if ctx.expected_output is None: return {} # Only compare if expected output is provided return ctx.output == ctx.expected_output ``` ### HasMatchingSpan Bases: `Evaluator[object, object, object]` Check if the span tree contains a span that matches the specified query. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class HasMatchingSpan(Evaluator[object, object, object]): """Check if the span tree contains a span that matches the specified query.""" query: SpanQuery evaluation_name: str | None = field(default=None) def evaluate( self, ctx: EvaluatorContext[object, object, object], ) -> bool: return ctx.span_tree.any(self.query) ``` ### IsInstance Bases: `Evaluator[object, object, object]` Check if the output is an instance of a type with the given name. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class IsInstance(Evaluator[object, object, object]): """Check if the output is an instance of a type with the given name.""" type_name: str evaluation_name: str | None = field(default=None) def evaluate(self, ctx: EvaluatorContext[object, object, object]) -> EvaluationReason: output = ctx.output for cls in type(output).__mro__: if cls.__name__ == self.type_name or cls.__qualname__ == self.type_name: return EvaluationReason(value=True) reason = f'output is of type {type(output).__name__}' if type(output).__qualname__ != type(output).__name__: reason += f' (qualname: {type(output).__qualname__})' return EvaluationReason(value=False, reason=reason) ``` ### LLMJudge Bases: `Evaluator[object, object, object]` Judge whether the output of a language model meets the criteria of a provided rubric. If you do not specify a model, it uses the default model for judging. This starts as 'openai:gpt-4o', but can be overridden by calling set_default_judge_model. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class LLMJudge(Evaluator[object, object, object]): """Judge whether the output of a language model meets the criteria of a provided rubric. If you do not specify a model, it uses the default model for judging. This starts as 'openai:gpt-4o', but can be overridden by calling [`set_default_judge_model`][pydantic_evals.evaluators.llm_as_a_judge.set_default_judge_model]. 
""" rubric: str model: models.Model | models.KnownModelName | None = None include_input: bool = False include_expected_output: bool = False model_settings: ModelSettings | None = None score: OutputConfig | Literal[False] = False assertion: OutputConfig | Literal[False] = field(default_factory=lambda: OutputConfig(include_reason=True)) async def evaluate( self, ctx: EvaluatorContext[object, object, object], ) -> EvaluatorOutput: if self.include_input: if self.include_expected_output: from .llm_as_a_judge import judge_input_output_expected grading_output = await judge_input_output_expected( ctx.inputs, ctx.output, ctx.expected_output, self.rubric, self.model, self.model_settings ) else: from .llm_as_a_judge import judge_input_output grading_output = await judge_input_output( ctx.inputs, ctx.output, self.rubric, self.model, self.model_settings ) else: if self.include_expected_output: from .llm_as_a_judge import judge_output_expected grading_output = await judge_output_expected( ctx.output, ctx.expected_output, self.rubric, self.model, self.model_settings ) else: from .llm_as_a_judge import judge_output grading_output = await judge_output(ctx.output, self.rubric, self.model, self.model_settings) output: dict[str, EvaluationScalar | EvaluationReason] = {} include_both = self.score is not False and self.assertion is not False evaluation_name = self.get_default_evaluation_name() if self.score is not False: default_name = f'{evaluation_name}_score' if include_both else evaluation_name _update_combined_output(output, grading_output.score, grading_output.reason, self.score, default_name) if self.assertion is not False: default_name = f'{evaluation_name}_pass' if include_both else evaluation_name _update_combined_output(output, grading_output.pass_, grading_output.reason, self.assertion, default_name) return output def build_serialization_arguments(self): result = super().build_serialization_arguments() # always serialize the model as a string when present; use its name if it's a KnownModelName if (model := result.get('model')) and isinstance(model, models.Model): # pragma: no branch result['model'] = f'{model.system}:{model.model_name}' # Note: this may lead to confusion if you try to serialize-then-deserialize with a custom model. # I expect that is rare enough to be worth not solving yet, but common enough that we probably will want to # solve it eventually. I'm imagining some kind of model registry, but don't want to work out the details yet. return result ``` ### MaxDuration Bases: `Evaluator[object, object, object]` Check if the execution time is under the specified maximum. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class MaxDuration(Evaluator[object, object, object]): """Check if the execution time is under the specified maximum.""" seconds: float | timedelta def evaluate(self, ctx: EvaluatorContext[object, object, object]) -> bool: duration = timedelta(seconds=ctx.duration) seconds = self.seconds if not isinstance(seconds, timedelta): seconds = timedelta(seconds=seconds) return duration <= seconds ``` ### OutputConfig Bases: `TypedDict` Configuration for the score and assertion outputs of the LLMJudge evaluator. 
Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python class OutputConfig(TypedDict, total=False): """Configuration for the score and assertion outputs of the LLMJudge evaluator.""" evaluation_name: str include_reason: bool ``` ### EvaluatorContext Bases: `Generic[InputsT, OutputT, MetadataT]` Context for evaluating a task execution. An instance of this class is the sole input to all Evaluators. It contains all the information needed to evaluate the task execution, including inputs, outputs, metadata, and telemetry data. Evaluators use this context to access the task inputs, actual output, expected output, and other information when evaluating the result of the task execution. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: # Use the context to access task inputs, outputs, and expected outputs return ctx.output == ctx.expected_output ``` Source code in `pydantic_evals/pydantic_evals/evaluators/context.py` ````python @dataclass(kw_only=True) class EvaluatorContext(Generic[InputsT, OutputT, MetadataT]): """Context for evaluating a task execution. An instance of this class is the sole input to all Evaluators. It contains all the information needed to evaluate the task execution, including inputs, outputs, metadata, and telemetry data. Evaluators use this context to access the task inputs, actual output, expected output, and other information when evaluating the result of the task execution. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: # Use the context to access task inputs, outputs, and expected outputs return ctx.output == ctx.expected_output ``` """ name: str | None """The name of the case.""" inputs: InputsT """The inputs provided to the task for this case.""" metadata: MetadataT | None """Metadata associated with the case, if provided. May be None if no metadata was specified.""" expected_output: OutputT | None """The expected output for the case, if provided. May be None if no expected output was specified.""" output: OutputT """The actual output produced by the task for this case.""" duration: float """The duration of the task run for this case.""" _span_tree: SpanTree | SpanTreeRecordingError = field(repr=False) """The span tree for the task run for this case. This will be `None` if `logfire.configure` has not been called. """ attributes: dict[str, Any] """Attributes associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.set_eval_attribute` in any code executed during the evaluation task.""" metrics: dict[str, int | float] """Metrics associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.increment_eval_metric` in any code executed during the evaluation task.""" @property def span_tree(self) -> SpanTree: """Get the `SpanTree` for this task execution. The span tree is a graph where each node corresponds to an OpenTelemetry span recorded during the task execution, including timing information and any custom spans created during execution. Returns: The span tree for the task execution. Raises: SpanTreeRecordingError: If spans were not captured during execution of the task, e.g. due to not having the necessary dependencies installed. 
""" if isinstance(self._span_tree, SpanTreeRecordingError): # In this case, there was a reason we couldn't record the SpanTree. We raise that now raise self._span_tree return self._span_tree ```` #### name ```python name: str | None ``` The name of the case. #### inputs ```python inputs: InputsT ``` The inputs provided to the task for this case. #### metadata ```python metadata: MetadataT | None ``` Metadata associated with the case, if provided. May be None if no metadata was specified. #### expected_output ```python expected_output: OutputT | None ``` The expected output for the case, if provided. May be None if no expected output was specified. #### output ```python output: OutputT ``` The actual output produced by the task for this case. #### duration ```python duration: float ``` The duration of the task run for this case. #### attributes ```python attributes: dict[str, Any] ``` Attributes associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.set_eval_attribute` in any code executed during the evaluation task. #### metrics ```python metrics: dict[str, int | float] ``` Metrics associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.increment_eval_metric` in any code executed during the evaluation task. #### span_tree ```python span_tree: SpanTree ``` Get the `SpanTree` for this task execution. The span tree is a graph where each node corresponds to an OpenTelemetry span recorded during the task execution, including timing information and any custom spans created during execution. Returns: | Type | Description | | --- | --- | | `SpanTree` | The span tree for the task execution. | Raises: | Type | Description | | --- | --- | | `SpanTreeRecordingError` | If spans were not captured during execution of the task, e.g. due to not having the necessary dependencies installed. | ### EvaluationReason The result of running an evaluator with an optional explanation. Contains a scalar value and an optional "reason" explaining the value. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `value` | `EvaluationScalar` | The scalar result of the evaluation (boolean, integer, float, or string). | *required* | | `reason` | `str | None` | An optional explanation of the evaluation result. | `None` | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @dataclass class EvaluationReason: """The result of running an evaluator with an optional explanation. Contains a scalar value and an optional "reason" explaining the value. Args: value: The scalar result of the evaluation (boolean, integer, float, or string). reason: An optional explanation of the evaluation result. """ value: EvaluationScalar reason: str | None = None ``` ### EvaluationResult Bases: `Generic[EvaluationScalarT]` The details of an individual evaluation result. Contains the name, value, reason, and source evaluator for a single evaluation. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str` | The name of the evaluation. | *required* | | `value` | `EvaluationScalarT` | The scalar result of the evaluation. | *required* | | `reason` | `str | None` | An optional explanation of the evaluation result. | *required* | | `source` | `EvaluatorSpec` | The spec of the evaluator that produced this result. 
| *required* | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @dataclass class EvaluationResult(Generic[EvaluationScalarT]): """The details of an individual evaluation result. Contains the name, value, reason, and source evaluator for a single evaluation. Args: name: The name of the evaluation. value: The scalar result of the evaluation. reason: An optional explanation of the evaluation result. source: The spec of the evaluator that produced this result. """ name: str value: EvaluationScalarT reason: str | None source: EvaluatorSpec def downcast(self, *value_types: type[T]) -> EvaluationResult[T] | None: """Attempt to downcast this result to a more specific type. Args: *value_types: The types to check the value against. Returns: A downcast version of this result if the value is an instance of one of the given types, otherwise None. """ # Check if value matches any of the target types, handling bool as a special case for value_type in value_types: if isinstance(self.value, value_type): # Only match bool with explicit bool type if isinstance(self.value, bool) and value_type is not bool: continue return cast(EvaluationResult[T], self) return None ``` #### downcast ```python downcast( *value_types: type[T], ) -> EvaluationResult[T] | None ``` Attempt to downcast this result to a more specific type. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `*value_types` | `type[T]` | The types to check the value against. | `()` | Returns: | Type | Description | | --- | --- | | `EvaluationResult[T] | None` | A downcast version of this result if the value is an instance of one of the given types, | | `EvaluationResult[T] | None` | otherwise None. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python def downcast(self, *value_types: type[T]) -> EvaluationResult[T] | None: """Attempt to downcast this result to a more specific type. Args: *value_types: The types to check the value against. Returns: A downcast version of this result if the value is an instance of one of the given types, otherwise None. """ # Check if value matches any of the target types, handling bool as a special case for value_type in value_types: if isinstance(self.value, value_type): # Only match bool with explicit bool type if isinstance(self.value, bool) and value_type is not bool: continue return cast(EvaluationResult[T], self) return None ``` ### Evaluator Bases: `Generic[InputsT, OutputT, MetadataT]` Base class for all evaluators. Evaluators can assess the performance of a task in a variety of ways, as a function of the EvaluatorContext. Subclasses must implement the `evaluate` method. Note it can be defined with either `def` or `async def`. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output ``` Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ````python @dataclass(repr=False) class Evaluator(Generic[InputsT, OutputT, MetadataT], metaclass=_StrictABCMeta): """Base class for all evaluators. Evaluators can assess the performance of a task in a variety of ways, as a function of the EvaluatorContext. Subclasses must implement the `evaluate` method. Note it can be defined with either `def` or `async def`. 
Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output ``` """ __pydantic_config__ = ConfigDict(arbitrary_types_allowed=True) @classmethod def get_serialization_name(cls) -> str: """Return the 'name' of this Evaluator to use during serialization. Returns: The name of the Evaluator, which is typically the class name. """ return cls.__name__ @classmethod @deprecated('`name` has been renamed, use `get_serialization_name` instead.') def name(cls) -> str: """`name` has been renamed, use `get_serialization_name` instead.""" return cls.get_serialization_name() def get_default_evaluation_name(self) -> str: """Return the default name to use in reports for the output of this evaluator. By default, if the evaluator has an attribute called `evaluation_name` of type string, that will be used. Otherwise, the serialization name of the evaluator (which is usually the class name) will be used. This can be overridden to get a more descriptive name in evaluation reports, e.g. using instance information. Note that evaluators that return a mapping of results will always use the keys of that mapping as the names of the associated evaluation results. """ evaluation_name = getattr(self, 'evaluation_name', None) if isinstance(evaluation_name, str): # If the evaluator has an attribute `name` of type string, use that return evaluation_name return self.get_serialization_name() @abstractmethod def evaluate( self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT] ) -> EvaluatorOutput | Awaitable[EvaluatorOutput]: # pragma: no cover """Evaluate the task output in the given context. This is the main evaluation method that subclasses must implement. It can be either synchronous or asynchronous, returning either an EvaluatorOutput directly or an Awaitable[EvaluatorOutput]. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. Can be returned either synchronously or as an awaitable for asynchronous evaluation. """ raise NotImplementedError('You must implement `evaluate`.') def evaluate_sync(self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput: """Run the evaluator synchronously, handling both sync and async implementations. This method ensures synchronous execution by running any async evaluate implementation to completion using run_until_complete. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. """ output = self.evaluate(ctx) if inspect.iscoroutine(output): # pragma: no cover return get_event_loop().run_until_complete(output) else: return cast(EvaluatorOutput, output) async def evaluate_async(self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput: """Run the evaluator asynchronously, handling both sync and async implementations. This method ensures asynchronous execution by properly awaiting any async evaluate implementation. For synchronous implementations, it returns the result directly. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. 
Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. """ # Note: If self.evaluate is synchronous, but you need to prevent this from blocking, override this method with: # return await anyio.to_thread.run_sync(self.evaluate, ctx) output = self.evaluate(ctx) if inspect.iscoroutine(output): return await output else: return cast(EvaluatorOutput, output) @model_serializer(mode='plain') def serialize(self, info: SerializationInfo) -> Any: """Serialize this Evaluator to a JSON-serializable form. Returns: A JSON-serializable representation of this evaluator as an EvaluatorSpec. """ return to_jsonable_python( self.as_spec(), context=info.context, serialize_unknown=True, ) def as_spec(self) -> EvaluatorSpec: raw_arguments = self.build_serialization_arguments() arguments: None | tuple[Any,] | dict[str, Any] if len(raw_arguments) == 0: arguments = None elif len(raw_arguments) == 1: arguments = (next(iter(raw_arguments.values())),) else: arguments = raw_arguments return EvaluatorSpec(name=self.get_serialization_name(), arguments=arguments) def build_serialization_arguments(self) -> dict[str, Any]: """Build the arguments for serialization. Evaluators are serialized for inclusion as the "source" in an `EvaluationResult`. If you want to modify how the evaluator is serialized for that or other purposes, you can override this method. Returns: A dictionary of arguments to be used during serialization. """ raw_arguments: dict[str, Any] = {} for field in fields(self): value = getattr(self, field.name) # always exclude defaults: if field.default is not MISSING: if value == field.default: continue if field.default_factory is not MISSING: if value == field.default_factory(): # pragma: no branch continue raw_arguments[field.name] = value return raw_arguments __repr__ = _utils.dataclasses_no_defaults_repr ```` #### get_serialization_name ```python get_serialization_name() -> str ``` Return the 'name' of this Evaluator to use during serialization. Returns: | Type | Description | | --- | --- | | `str` | The name of the Evaluator, which is typically the class name. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @classmethod def get_serialization_name(cls) -> str: """Return the 'name' of this Evaluator to use during serialization. Returns: The name of the Evaluator, which is typically the class name. """ return cls.__name__ ``` #### name ```python name() -> str ``` Deprecated `name` has been renamed, use `get_serialization_name` instead. `name` has been renamed, use `get_serialization_name` instead. Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @classmethod @deprecated('`name` has been renamed, use `get_serialization_name` instead.') def name(cls) -> str: """`name` has been renamed, use `get_serialization_name` instead.""" return cls.get_serialization_name() ``` #### get_default_evaluation_name ```python get_default_evaluation_name() -> str ``` Return the default name to use in reports for the output of this evaluator. By default, if the evaluator has an attribute called `evaluation_name` of type string, that will be used. Otherwise, the serialization name of the evaluator (which is usually the class name) will be used. This can be overridden to get a more descriptive name in evaluation reports, e.g. using instance information. 
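For instance, a minimal sketch of a custom evaluator whose report name comes from an `evaluation_name` field (the `MinLength` evaluator and its fields are illustrative):

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class MinLength(Evaluator):
    """Illustrative evaluator: check the output is at least `min_length` characters long."""

    min_length: int
    evaluation_name: str | None = None  # picked up by get_default_evaluation_name when set

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return len(str(ctx.output)) >= self.min_length


print(MinLength(3).get_default_evaluation_name())
#> MinLength
print(MinLength(3, evaluation_name='long_enough').get_default_evaluation_name())
#> long_enough
```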
Note that evaluators that return a mapping of results will always use the keys of that mapping as the names of the associated evaluation results. Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python def get_default_evaluation_name(self) -> str: """Return the default name to use in reports for the output of this evaluator. By default, if the evaluator has an attribute called `evaluation_name` of type string, that will be used. Otherwise, the serialization name of the evaluator (which is usually the class name) will be used. This can be overridden to get a more descriptive name in evaluation reports, e.g. using instance information. Note that evaluators that return a mapping of results will always use the keys of that mapping as the names of the associated evaluation results. """ evaluation_name = getattr(self, 'evaluation_name', None) if isinstance(evaluation_name, str): # If the evaluator has an attribute `name` of type string, use that return evaluation_name return self.get_serialization_name() ``` #### evaluate ```python evaluate( ctx: EvaluatorContext[InputsT, OutputT, MetadataT], ) -> EvaluatorOutput | Awaitable[EvaluatorOutput] ``` Evaluate the task output in the given context. This is the main evaluation method that subclasses must implement. It can be either synchronous or asynchronous, returning either an EvaluatorOutput directly or an Awaitable[EvaluatorOutput]. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `ctx` | `EvaluatorContext[InputsT, OutputT, MetadataT]` | The context containing the inputs, outputs, and metadata for evaluation. | *required* | Returns: | Type | Description | | --- | --- | | `EvaluatorOutput | Awaitable[EvaluatorOutput]` | The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping | | `EvaluatorOutput | Awaitable[EvaluatorOutput]` | of evaluation names to either of those. Can be returned either synchronously or as an | | `EvaluatorOutput | Awaitable[EvaluatorOutput]` | awaitable for asynchronous evaluation. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @abstractmethod def evaluate( self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT] ) -> EvaluatorOutput | Awaitable[EvaluatorOutput]: # pragma: no cover """Evaluate the task output in the given context. This is the main evaluation method that subclasses must implement. It can be either synchronous or asynchronous, returning either an EvaluatorOutput directly or an Awaitable[EvaluatorOutput]. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. Can be returned either synchronously or as an awaitable for asynchronous evaluation. """ raise NotImplementedError('You must implement `evaluate`.') ``` #### evaluate_sync ```python evaluate_sync( ctx: EvaluatorContext[InputsT, OutputT, MetadataT], ) -> EvaluatorOutput ``` Run the evaluator synchronously, handling both sync and async implementations. This method ensures synchronous execution by running any async evaluate implementation to completion using run_until_complete. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `ctx` | `EvaluatorContext[InputsT, OutputT, MetadataT]` | The context containing the inputs, outputs, and metadata for evaluation. 
| *required* | Returns: | Type | Description | | --- | --- | | `EvaluatorOutput` | The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping | | `EvaluatorOutput` | of evaluation names to either of those. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python def evaluate_sync(self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput: """Run the evaluator synchronously, handling both sync and async implementations. This method ensures synchronous execution by running any async evaluate implementation to completion using run_until_complete. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. """ output = self.evaluate(ctx) if inspect.iscoroutine(output): # pragma: no cover return get_event_loop().run_until_complete(output) else: return cast(EvaluatorOutput, output) ``` #### evaluate_async ```python evaluate_async( ctx: EvaluatorContext[InputsT, OutputT, MetadataT], ) -> EvaluatorOutput ``` Run the evaluator asynchronously, handling both sync and async implementations. This method ensures asynchronous execution by properly awaiting any async evaluate implementation. For synchronous implementations, it returns the result directly. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `ctx` | `EvaluatorContext[InputsT, OutputT, MetadataT]` | The context containing the inputs, outputs, and metadata for evaluation. | *required* | Returns: | Type | Description | | --- | --- | | `EvaluatorOutput` | The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping | | `EvaluatorOutput` | of evaluation names to either of those. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python async def evaluate_async(self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput: """Run the evaluator asynchronously, handling both sync and async implementations. This method ensures asynchronous execution by properly awaiting any async evaluate implementation. For synchronous implementations, it returns the result directly. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. """ # Note: If self.evaluate is synchronous, but you need to prevent this from blocking, override this method with: # return await anyio.to_thread.run_sync(self.evaluate, ctx) output = self.evaluate(ctx) if inspect.iscoroutine(output): return await output else: return cast(EvaluatorOutput, output) ``` #### serialize ```python serialize(info: SerializationInfo) -> Any ``` Serialize this Evaluator to a JSON-serializable form. Returns: | Type | Description | | --- | --- | | `Any` | A JSON-serializable representation of this evaluator as an EvaluatorSpec. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @model_serializer(mode='plain') def serialize(self, info: SerializationInfo) -> Any: """Serialize this Evaluator to a JSON-serializable form. Returns: A JSON-serializable representation of this evaluator as an EvaluatorSpec. 
""" return to_jsonable_python( self.as_spec(), context=info.context, serialize_unknown=True, ) ``` #### build_serialization_arguments ```python build_serialization_arguments() -> dict[str, Any] ``` Build the arguments for serialization. Evaluators are serialized for inclusion as the "source" in an `EvaluationResult`. If you want to modify how the evaluator is serialized for that or other purposes, you can override this method. Returns: | Type | Description | | --- | --- | | `dict[str, Any]` | A dictionary of arguments to be used during serialization. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python def build_serialization_arguments(self) -> dict[str, Any]: """Build the arguments for serialization. Evaluators are serialized for inclusion as the "source" in an `EvaluationResult`. If you want to modify how the evaluator is serialized for that or other purposes, you can override this method. Returns: A dictionary of arguments to be used during serialization. """ raw_arguments: dict[str, Any] = {} for field in fields(self): value = getattr(self, field.name) # always exclude defaults: if field.default is not MISSING: if value == field.default: continue if field.default_factory is not MISSING: if value == field.default_factory(): # pragma: no branch continue raw_arguments[field.name] = value return raw_arguments ``` ### EvaluatorFailure Represents a failure raised during the execution of an evaluator. Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @dataclass class EvaluatorFailure: """Represents a failure raised during the execution of an evaluator.""" name: str error_message: str error_stacktrace: str source: EvaluatorSpec ``` ### EvaluatorOutput ```python EvaluatorOutput = ( EvaluationScalar | EvaluationReason | Mapping[str, EvaluationScalar | EvaluationReason] ) ``` Type for the output of an evaluator, which can be a scalar, an EvaluationReason, or a mapping of names to either. ### EvaluatorSpec Bases: `BaseModel` The specification of an evaluator to be run. This class is used to represent evaluators in a serializable format, supporting various short forms for convenience when defining evaluators in YAML or JSON dataset files. In particular, each of the following forms is supported for specifying an evaluator with name `MyEvaluator`: * `'MyEvaluator'` - Just the (string) name of the Evaluator subclass is used if its `__init__` takes no arguments * `{'MyEvaluator': first_arg}` - A single argument is passed as the first positional argument to `MyEvaluator.__init__` * `{'MyEvaluator': {k1: v1, k2: v2}}` - Multiple kwargs are passed to `MyEvaluator.__init__` Source code in `pydantic_evals/pydantic_evals/evaluators/spec.py` ```python class EvaluatorSpec(BaseModel): """The specification of an evaluator to be run. This class is used to represent evaluators in a serializable format, supporting various short forms for convenience when defining evaluators in YAML or JSON dataset files. 
In particular, each of the following forms is supported for specifying an evaluator with name `MyEvaluator`: * `'MyEvaluator'` - Just the (string) name of the Evaluator subclass is used if its `__init__` takes no arguments * `{'MyEvaluator': first_arg}` - A single argument is passed as the first positional argument to `MyEvaluator.__init__` * `{'MyEvaluator': {k1: v1, k2: v2}}` - Multiple kwargs are passed to `MyEvaluator.__init__` """ name: str """The name of the evaluator class; should be the value returned by `EvaluatorClass.get_serialization_name()`""" arguments: None | tuple[Any] | dict[str, Any] """The arguments to pass to the evaluator's constructor. Can be None (no arguments), a tuple (a single positional argument), or a dict (keyword arguments). """ @property def args(self) -> tuple[Any, ...]: """Get the positional arguments for the evaluator. Returns: A tuple of positional arguments if arguments is a tuple, otherwise an empty tuple. """ if isinstance(self.arguments, tuple): return self.arguments return () @property def kwargs(self) -> dict[str, Any]: """Get the keyword arguments for the evaluator. Returns: A dictionary of keyword arguments if arguments is a dict, otherwise an empty dict. """ if isinstance(self.arguments, dict): return self.arguments return {} @model_validator(mode='wrap') @classmethod def deserialize(cls, value: Any, handler: ModelWrapValidatorHandler[EvaluatorSpec]) -> EvaluatorSpec: """Deserialize an EvaluatorSpec from various formats. This validator handles the various short forms of evaluator specifications, converting them to a consistent EvaluatorSpec instance. Args: value: The value to deserialize. handler: The validator handler. Returns: The deserialized EvaluatorSpec. Raises: ValidationError: If the value cannot be deserialized. """ try: result = handler(value) return result except ValidationError as exc: try: deserialized = _SerializedEvaluatorSpec.model_validate(value) except ValidationError: raise exc # raise the original error return deserialized.to_evaluator_spec() @model_serializer(mode='wrap') def serialize(self, handler: SerializerFunctionWrapHandler, info: SerializationInfo) -> Any: """Serialize using the appropriate short-form if possible. Returns: The serialized evaluator specification, using the shortest form possible: - Just the name if there are no arguments - {name: first_arg} if there's a single positional argument - {name: {kwargs}} if there are multiple (keyword) arguments """ if isinstance(info.context, dict) and info.context.get('use_short_form'): # pyright: ignore[reportUnknownMemberType] if self.arguments is None: return self.name elif isinstance(self.arguments, tuple): return {self.name: self.arguments[0]} else: return {self.name: self.arguments} else: return handler(self) ``` #### name ```python name: str ``` The name of the evaluator class; should be the value returned by `EvaluatorClass.get_serialization_name()` #### arguments ```python arguments: None | tuple[Any] | dict[str, Any] ``` The arguments to pass to the evaluator's constructor. Can be None (no arguments), a tuple (a single positional argument), or a dict (keyword arguments). #### args ```python args: tuple[Any, ...] ``` Get the positional arguments for the evaluator. Returns: | Type | Description | | --- | --- | | `tuple[Any, ...]` | A tuple of positional arguments if arguments is a tuple, otherwise an empty tuple. | #### kwargs ```python kwargs: dict[str, Any] ``` Get the keyword arguments for the evaluator. 
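To illustrate the `args`/`kwargs` behaviour summarised in the tables below, here is a small sketch (the evaluator name `MyEvaluator` is hypothetical, and the import path is assumed from the source location shown above):

```python
from pydantic_evals.evaluators.spec import EvaluatorSpec

no_args = EvaluatorSpec(name='MyEvaluator', arguments=None)
one_arg = EvaluatorSpec(name='MyEvaluator', arguments=(0.5,))
keyword_args = EvaluatorSpec(name='MyEvaluator', arguments={'rubric': 'is concise'})

print(no_args.args, no_args.kwargs)
#> () {}
print(one_arg.args, one_arg.kwargs)
#> (0.5,) {}
print(keyword_args.args, keyword_args.kwargs)
#> () {'rubric': 'is concise'}
```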
Returns: | Type | Description | | --- | --- | | `dict[str, Any]` | A dictionary of keyword arguments if arguments is a dict, otherwise an empty dict. | #### deserialize ```python deserialize( value: Any, handler: ModelWrapValidatorHandler[EvaluatorSpec], ) -> EvaluatorSpec ``` Deserialize an EvaluatorSpec from various formats. This validator handles the various short forms of evaluator specifications, converting them to a consistent EvaluatorSpec instance. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `value` | `Any` | The value to deserialize. | *required* | | `handler` | `ModelWrapValidatorHandler[EvaluatorSpec]` | The validator handler. | *required* | Returns: | Type | Description | | --- | --- | | `EvaluatorSpec` | The deserialized EvaluatorSpec. | Raises: | Type | Description | | --- | --- | | `ValidationError` | If the value cannot be deserialized. | Source code in `pydantic_evals/pydantic_evals/evaluators/spec.py` ```python @model_validator(mode='wrap') @classmethod def deserialize(cls, value: Any, handler: ModelWrapValidatorHandler[EvaluatorSpec]) -> EvaluatorSpec: """Deserialize an EvaluatorSpec from various formats. This validator handles the various short forms of evaluator specifications, converting them to a consistent EvaluatorSpec instance. Args: value: The value to deserialize. handler: The validator handler. Returns: The deserialized EvaluatorSpec. Raises: ValidationError: If the value cannot be deserialized. """ try: result = handler(value) return result except ValidationError as exc: try: deserialized = _SerializedEvaluatorSpec.model_validate(value) except ValidationError: raise exc # raise the original error return deserialized.to_evaluator_spec() ``` #### serialize ```python serialize( handler: SerializerFunctionWrapHandler, info: SerializationInfo, ) -> Any ``` Serialize using the appropriate short-form if possible. Returns: | Type | Description | | --- | --- | | `Any` | The serialized evaluator specification, using the shortest form possible: | | `Any` | Just the name if there are no arguments | | `Any` | {name: first_arg} if there's a single positional argument | | `Any` | {name: {kwargs}} if there are multiple (keyword) arguments | Source code in `pydantic_evals/pydantic_evals/evaluators/spec.py` ```python @model_serializer(mode='wrap') def serialize(self, handler: SerializerFunctionWrapHandler, info: SerializationInfo) -> Any: """Serialize using the appropriate short-form if possible. Returns: The serialized evaluator specification, using the shortest form possible: - Just the name if there are no arguments - {name: first_arg} if there's a single positional argument - {name: {kwargs}} if there are multiple (keyword) arguments """ if isinstance(info.context, dict) and info.context.get('use_short_form'): # pyright: ignore[reportUnknownMemberType] if self.arguments is None: return self.name elif isinstance(self.arguments, tuple): return {self.name: self.arguments[0]} else: return {self.name: self.arguments} else: return handler(self) ``` ### GradingOutput Bases: `BaseModel` The output of a grading operation. 
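`GradingOutput` is what the `judge_*` helpers documented below return. A minimal sketch (running it calls the default judge model, 'openai:gpt-4o', so credentials are required; the output and rubric strings are illustrative):

```python
import asyncio

from pydantic_evals.evaluators.llm_as_a_judge import judge_output


async def main():
    grading = await judge_output(
        output='Paris is the capital of France.',
        rubric='The answer correctly names the capital of France.',
    )
    # grading is a GradingOutput with `reason`, `pass_`, and `score` fields.
    print(grading.pass_, grading.score, grading.reason)


asyncio.run(main())
```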
Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python class GradingOutput(BaseModel, populate_by_name=True): """The output of a grading operation.""" reason: str pass_: bool = Field(validation_alias='pass', serialization_alias='pass') score: float ``` ### judge_output ```python judge_output( output: Any, rubric: str, model: Model | KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python async def judge_output( output: Any, rubric: str, model: models.Model | models.KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput: """Judge the output of a model based on a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. """ user_prompt = _build_prompt(output=output, rubric=rubric) return ( await _judge_output_agent.run(user_prompt, model=model or _default_model, model_settings=model_settings) ).output ``` ### judge_input_output ```python judge_input_output( inputs: Any, output: Any, rubric: str, model: Model | KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on the inputs and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python async def judge_input_output( inputs: Any, output: Any, rubric: str, model: models.Model | models.KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput: """Judge the output of a model based on the inputs and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. """ user_prompt = _build_prompt(inputs=inputs, output=output, rubric=rubric) return ( await _judge_input_output_agent.run(user_prompt, model=model or _default_model, model_settings=model_settings) ).output ``` ### judge_input_output_expected ```python judge_input_output_expected( inputs: Any, output: Any, expected_output: Any, rubric: str, model: Model | KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on the inputs and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python async def judge_input_output_expected( inputs: Any, output: Any, expected_output: Any, rubric: str, model: models.Model | models.KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput: """Judge the output of a model based on the inputs and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. 
""" user_prompt = _build_prompt(inputs=inputs, output=output, rubric=rubric, expected_output=expected_output) return ( await _judge_input_output_expected_agent.run( user_prompt, model=model or _default_model, model_settings=model_settings ) ).output ``` ### judge_output_expected ```python judge_output_expected( output: Any, expected_output: Any, rubric: str, model: Model | KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on the expected output, output, and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python async def judge_output_expected( output: Any, expected_output: Any, rubric: str, model: models.Model | models.KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput: """Judge the output of a model based on the expected output, output, and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. """ user_prompt = _build_prompt(output=output, rubric=rubric, expected_output=expected_output) return ( await _judge_output_expected_agent.run( user_prompt, model=model or _default_model, model_settings=model_settings ) ).output ``` ### set_default_judge_model ```python set_default_judge_model( model: Model | KnownModelName, ) -> None ``` Set the default model used for judging. This model is used if `None` is passed to the `model` argument of `judge_output` and `judge_input_output`. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python def set_default_judge_model(model: models.Model | models.KnownModelName) -> None: # pragma: no cover """Set the default model used for judging. This model is used if `None` is passed to the `model` argument of `judge_output` and `judge_input_output`. """ global _default_model _default_model = model ``` # `pydantic_evals.generation` Utilities for generating example datasets for pydantic_evals. This module provides functions for generating sample datasets for testing and examples, using LLMs to create realistic test data with proper structure. ### generate_dataset ```python generate_dataset( *, dataset_type: type[ Dataset[InputsT, OutputT, MetadataT] ], path: Path | str | None = None, custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), model: Model | KnownModelName = "openai:gpt-4o", n_examples: int = 3, extra_instructions: str | None = None ) -> Dataset[InputsT, OutputT, MetadataT] ``` Use an LLM to generate a dataset of test cases, each consisting of input, expected output, and metadata. This function creates a properly structured dataset with the specified input, output, and metadata types. It uses an LLM to attempt to generate realistic test cases that conform to the types' schemas. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `path` | `Path | str | None` | Optional path to save the generated dataset. If provided, the dataset will be saved to this location. | `None` | | `dataset_type` | `type[Dataset[InputsT, OutputT, MetadataT]]` | The type of dataset to generate, with the desired input, output, and metadata types. 
| *required* | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Optional sequence of custom evaluator classes to include in the schema. | `()` | | `model` | `Model | KnownModelName` | The Pydantic AI model to use for generation. Defaults to 'gpt-4o'. | `'openai:gpt-4o'` | | `n_examples` | `int` | Number of examples to generate. Defaults to 3. | `3` | | `extra_instructions` | `str | None` | Optional additional instructions to provide to the LLM. | `None` | Returns: | Type | Description | | --- | --- | | `Dataset[InputsT, OutputT, MetadataT]` | A properly structured Dataset object with generated test cases. | Raises: | Type | Description | | --- | --- | | `ValidationError` | If the LLM's response cannot be parsed as a valid dataset. | Source code in `pydantic_evals/pydantic_evals/generation.py` ```python async def generate_dataset( *, dataset_type: type[Dataset[InputsT, OutputT, MetadataT]], path: Path | str | None = None, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), model: models.Model | models.KnownModelName = 'openai:gpt-4o', n_examples: int = 3, extra_instructions: str | None = None, ) -> Dataset[InputsT, OutputT, MetadataT]: """Use an LLM to generate a dataset of test cases, each consisting of input, expected output, and metadata. This function creates a properly structured dataset with the specified input, output, and metadata types. It uses an LLM to attempt to generate realistic test cases that conform to the types' schemas. Args: path: Optional path to save the generated dataset. If provided, the dataset will be saved to this location. dataset_type: The type of dataset to generate, with the desired input, output, and metadata types. custom_evaluator_types: Optional sequence of custom evaluator classes to include in the schema. model: The Pydantic AI model to use for generation. Defaults to 'gpt-4o'. n_examples: Number of examples to generate. Defaults to 3. extra_instructions: Optional additional instructions to provide to the LLM. Returns: A properly structured Dataset object with generated test cases. Raises: ValidationError: If the LLM's response cannot be parsed as a valid dataset. """ output_schema = dataset_type.model_json_schema_with_evaluators(custom_evaluator_types) # TODO(DavidM): Update this once we add better response_format and/or ResultTool support to Pydantic AI agent = Agent( model, system_prompt=( f'Generate an object that is in compliance with this JSON schema:\n{output_schema}\n\n' f'Include {n_examples} example cases.' ' You must not include any characters in your response before the opening { of the JSON object, or after the closing }.' ), output_type=str, retries=1, ) result = await agent.run(extra_instructions or 'Please generate the object.') try: result = dataset_type.from_text(result.output, fmt='json', custom_evaluator_types=custom_evaluator_types) except ValidationError as e: # pragma: no cover print(f'Raw response from model:\n{result.output}') raise e if path is not None: result.to_file(path, custom_evaluator_types=custom_evaluator_types) # pragma: no cover return result ``` # `pydantic_evals.otel` ### SpanNode A node in the span tree; provides references to parents/children for easy traversal and queries. 
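Before the full listing, here is a rough sketch of how a node is typically queried once you have one (for example, a root taken from a `SpanTree`). The import path is inferred from the source path shown below, and the span names and attribute keys are invented for illustration:

span_node_query.py
```python
from pydantic_evals.otel.span_tree import SpanNode, SpanQuery


def has_slow_tool_call(node: SpanNode) -> bool:
    # Dict-based query: is there any descendant whose name mentions 'tool'
    # and that ran for at least one second?
    query: SpanQuery = {'name_contains': 'tool', 'min_duration': 1.0}
    if node.any_descendant(query):
        return True
    # The same style of check written as a callable predicate instead of a SpanQuery.
    return node.any_descendant(lambda n: n.attributes.get('retry') is True)
```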
Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python @dataclass(repr=False, kw_only=True) class SpanNode: """A node in the span tree; provides references to parents/children for easy traversal and queries.""" name: str trace_id: int span_id: int parent_span_id: int | None start_timestamp: datetime end_timestamp: datetime attributes: dict[str, AttributeValue] @property def duration(self) -> timedelta: """Return the span's duration as a timedelta, or None if start/end not set.""" return self.end_timestamp - self.start_timestamp @property def children(self) -> list[SpanNode]: return list(self.children_by_id.values()) @property def descendants(self) -> list[SpanNode]: """Return all descendants of this node in DFS order.""" return self.find_descendants(lambda _: True) @property def ancestors(self) -> list[SpanNode]: """Return all ancestors of this node.""" return self.find_ancestors(lambda _: True) @property def node_key(self) -> str: return f'{self.trace_id:032x}:{self.span_id:016x}' @property def parent_node_key(self) -> str | None: return None if self.parent_span_id is None else f'{self.trace_id:032x}:{self.parent_span_id:016x}' # ------------------------------------------------------------------------- # Construction # ------------------------------------------------------------------------- def __post_init__(self): self.parent: SpanNode | None = None self.children_by_id: dict[str, SpanNode] = {} @staticmethod def from_readable_span(span: ReadableSpan) -> SpanNode: assert span.context is not None, 'Span has no context' assert span.start_time is not None, 'Span has no start time' assert span.end_time is not None, 'Span has no end time' return SpanNode( name=span.name, trace_id=span.context.trace_id, span_id=span.context.span_id, parent_span_id=span.parent.span_id if span.parent else None, start_timestamp=datetime.fromtimestamp(span.start_time / 1e9, tz=timezone.utc), end_timestamp=datetime.fromtimestamp(span.end_time / 1e9, tz=timezone.utc), attributes=dict(span.attributes or {}), ) def add_child(self, child: SpanNode) -> None: """Attach a child node to this node's list of children.""" assert child.trace_id == self.trace_id, f"traces don't match: {child.trace_id:032x} != {self.trace_id:032x}" assert child.parent_span_id == self.span_id, ( f'parent span mismatch: {child.parent_span_id:016x} != {self.span_id:016x}' ) self.children_by_id[child.node_key] = child child.parent = self # ------------------------------------------------------------------------- # Child queries # ------------------------------------------------------------------------- def find_children(self, predicate: SpanQuery | SpanPredicate) -> list[SpanNode]: """Return all immediate children that satisfy the given predicate.""" return list(self._filter_children(predicate)) def first_child(self, predicate: SpanQuery | SpanPredicate) -> SpanNode | None: """Return the first immediate child that satisfies the given predicate, or None if none match.""" return next(self._filter_children(predicate), None) def any_child(self, predicate: SpanQuery | SpanPredicate) -> bool: """Returns True if there is at least one child that satisfies the predicate.""" return self.first_child(predicate) is not None def _filter_children(self, predicate: SpanQuery | SpanPredicate) -> Iterator[SpanNode]: return (child for child in self.children if child.matches(predicate)) # ------------------------------------------------------------------------- # Descendant queries (DFS) # 
------------------------------------------------------------------------- def find_descendants( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> list[SpanNode]: """Return all descendant nodes that satisfy the given predicate in DFS order.""" return list(self._filter_descendants(predicate, stop_recursing_when)) def first_descendant( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> SpanNode | None: """DFS: Return the first descendant (in DFS order) that satisfies the given predicate, or `None` if none match.""" return next(self._filter_descendants(predicate, stop_recursing_when), None) def any_descendant( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> bool: """Returns `True` if there is at least one descendant that satisfies the predicate.""" return self.first_descendant(predicate, stop_recursing_when) is not None def _filter_descendants( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None ) -> Iterator[SpanNode]: stack = list(self.children) while stack: node = stack.pop() if node.matches(predicate): yield node if stop_recursing_when is not None and node.matches(stop_recursing_when): continue stack.extend(node.children) # ------------------------------------------------------------------------- # Ancestor queries (DFS "up" the chain) # ------------------------------------------------------------------------- def find_ancestors( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> list[SpanNode]: """Return all ancestors that satisfy the given predicate.""" return list(self._filter_ancestors(predicate, stop_recursing_when)) def first_ancestor( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> SpanNode | None: """Return the closest ancestor that satisfies the given predicate, or `None` if none match.""" return next(self._filter_ancestors(predicate, stop_recursing_when), None) def any_ancestor( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> bool: """Returns True if any ancestor satisfies the predicate.""" return self.first_ancestor(predicate, stop_recursing_when) is not None def _filter_ancestors( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None ) -> Iterator[SpanNode]: node = self.parent while node: if node.matches(predicate): yield node if stop_recursing_when is not None and node.matches(stop_recursing_when): break node = node.parent # ------------------------------------------------------------------------- # Query matching # ------------------------------------------------------------------------- def matches(self, query: SpanQuery | SpanPredicate) -> bool: """Check if the span node matches the query conditions or predicate.""" if callable(query): return query(self) return self._matches_query(query) def _matches_query(self, query: SpanQuery) -> bool: # noqa C901 """Check if the span matches the query conditions.""" # Logical combinations if or_ := query.get('or_'): if len(query) > 1: raise ValueError("Cannot combine 'or_' conditions with other conditions at the same level") return any(self._matches_query(q) for q in or_) if not_ := query.get('not_'): if self._matches_query(not_): return False if and_ := query.get('and_'): results = 
[self._matches_query(q) for q in and_] if not all(results): return False # At this point, all existing ANDs and no existing ORs have passed, so it comes down to this condition # Name conditions if (name_equals := query.get('name_equals')) and self.name != name_equals: return False if (name_contains := query.get('name_contains')) and name_contains not in self.name: return False if (name_matches_regex := query.get('name_matches_regex')) and not re.match(name_matches_regex, self.name): return False # Attribute conditions if (has_attributes := query.get('has_attributes')) and not all( self.attributes.get(key) == value for key, value in has_attributes.items() ): return False if (has_attributes_keys := query.get('has_attribute_keys')) and not all( key in self.attributes for key in has_attributes_keys ): return False # Timing conditions if (min_duration := query.get('min_duration')) is not None: if not isinstance(min_duration, timedelta): min_duration = timedelta(seconds=min_duration) if self.duration < min_duration: return False if (max_duration := query.get('max_duration')) is not None: if not isinstance(max_duration, timedelta): max_duration = timedelta(seconds=max_duration) if self.duration > max_duration: return False # Children conditions if (min_child_count := query.get('min_child_count')) and len(self.children) < min_child_count: return False if (max_child_count := query.get('max_child_count')) and len(self.children) > max_child_count: return False if (some_child_has := query.get('some_child_has')) and not any( child._matches_query(some_child_has) for child in self.children ): return False if (all_children_have := query.get('all_children_have')) and not all( child._matches_query(all_children_have) for child in self.children ): return False if (no_child_has := query.get('no_child_has')) and any( child._matches_query(no_child_has) for child in self.children ): return False # Descendant conditions # The following local functions with cache decorators are used to avoid repeatedly evaluating these properties @cache def descendants(): return self.descendants @cache def pruned_descendants(): stop_recursing_when = query.get('stop_recursing_when') return ( self._filter_descendants(lambda _: True, stop_recursing_when) if stop_recursing_when else descendants() ) if (min_descendant_count := query.get('min_descendant_count')) and len(descendants()) < min_descendant_count: return False if (max_descendant_count := query.get('max_descendant_count')) and len(descendants()) > max_descendant_count: return False if (some_descendant_has := query.get('some_descendant_has')) and not any( descendant._matches_query(some_descendant_has) for descendant in pruned_descendants() ): return False if (all_descendants_have := query.get('all_descendants_have')) and not all( descendant._matches_query(all_descendants_have) for descendant in pruned_descendants() ): return False if (no_descendant_has := query.get('no_descendant_has')) and any( descendant._matches_query(no_descendant_has) for descendant in pruned_descendants() ): return False # Ancestor conditions # The following local functions with cache decorators are used to avoid repeatedly evaluating these properties @cache def ancestors(): return self.ancestors @cache def pruned_ancestors(): stop_recursing_when = query.get('stop_recursing_when') return self._filter_ancestors(lambda _: True, stop_recursing_when) if stop_recursing_when else ancestors() if (min_depth := query.get('min_depth')) and len(ancestors()) < min_depth: return False if (max_depth := 
query.get('max_depth')) and len(ancestors()) > max_depth: return False if (some_ancestor_has := query.get('some_ancestor_has')) and not any( ancestor._matches_query(some_ancestor_has) for ancestor in pruned_ancestors() ): return False if (all_ancestors_have := query.get('all_ancestors_have')) and not all( ancestor._matches_query(all_ancestors_have) for ancestor in pruned_ancestors() ): return False if (no_ancestor_has := query.get('no_ancestor_has')) and any( ancestor._matches_query(no_ancestor_has) for ancestor in pruned_ancestors() ): return False return True # ------------------------------------------------------------------------- # String representation # ------------------------------------------------------------------------- def repr_xml( self, include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str: """Return an XML-like string representation of the node. Optionally includes children, trace_id, span_id, start_timestamp, and duration. """ first_line_parts = [f'<SpanNode name={self.name!r}'] if include_trace_id: first_line_parts.append(f"trace_id='{self.trace_id:032x}'") if include_span_id: first_line_parts.append(f"span_id='{self.span_id:016x}'") if include_start_timestamp: first_line_parts.append(f'start_timestamp={self.start_timestamp.isoformat()!r}') if include_duration: first_line_parts.append(f"duration='{self.duration}'") extra_lines: list[str] = [] if include_children and self.children: first_line_parts.append('>') for child in self.children: extra_lines.append( indent( child.repr_xml( include_children=include_children, include_trace_id=include_trace_id, include_span_id=include_span_id, include_start_timestamp=include_start_timestamp, include_duration=include_duration, ), ' ', ) ) extra_lines.append('</SpanNode>') else: if self.children: first_line_parts.append('children=...') first_line_parts.append('/>') return '\n'.join([' '.join(first_line_parts), *extra_lines]) def __str__(self) -> str: if self.children: return f"<SpanNode name={self.name!r} span_id='{self.span_id:016x}'>...</SpanNode>" else: return f"<SpanNode name={self.name!r} span_id='{self.span_id:016x}' />" def __repr__(self) -> str: return self.repr_xml() ``` #### duration ```python duration: timedelta ``` Return the span's duration as a timedelta, or None if start/end not set. #### descendants ```python descendants: list[SpanNode] ``` Return all descendants of this node in DFS order. #### ancestors ```python ancestors: list[SpanNode] ``` Return all ancestors of this node. #### add_child ```python add_child(child: SpanNode) -> None ``` Attach a child node to this node's list of children. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def add_child(self, child: SpanNode) -> None: """Attach a child node to this node's list of children.""" assert child.trace_id == self.trace_id, f"traces don't match: {child.trace_id:032x} != {self.trace_id:032x}" assert child.parent_span_id == self.span_id, ( f'parent span mismatch: {child.parent_span_id:016x} != {self.span_id:016x}' ) self.children_by_id[child.node_key] = child child.parent = self ``` #### find_children ```python find_children( predicate: SpanQuery | SpanPredicate, ) -> list[SpanNode] ``` Return all immediate children that satisfy the given predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def find_children(self, predicate: SpanQuery | SpanPredicate) -> list[SpanNode]: """Return all immediate children that satisfy the given predicate.""" return list(self._filter_children(predicate)) ``` #### first_child ```python first_child( predicate: SpanQuery | SpanPredicate, ) -> SpanNode | None ``` Return the first immediate child that satisfies the given predicate, or None if none match.
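A short sketch of the immediate-child helpers documented here (`find_children`, `first_child`, `any_child`); the span names used in the queries are made up for illustration:

span_child_queries.py
```python
from pydantic_evals.otel.span_tree import SpanNode


def describe_children(node: SpanNode) -> str:
    tool_children = node.find_children({'name_contains': 'tool'})  # all matching direct children
    first_model_call = node.first_child({'name_contains': 'chat'})  # first match, or None
    has_slow_child = node.any_child(lambda child: child.duration.total_seconds() > 5)
    return f'{len(tool_children)} tool children, first model call: {first_model_call}, slow child: {has_slow_child}'
```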
Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def first_child(self, predicate: SpanQuery | SpanPredicate) -> SpanNode | None: """Return the first immediate child that satisfies the given predicate, or None if none match.""" return next(self._filter_children(predicate), None) ``` #### any_child ```python any_child(predicate: SpanQuery | SpanPredicate) -> bool ``` Returns True if there is at least one child that satisfies the predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def any_child(self, predicate: SpanQuery | SpanPredicate) -> bool: """Returns True if there is at least one child that satisfies the predicate.""" return self.first_child(predicate) is not None ``` #### find_descendants ```python find_descendants( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> list[SpanNode] ``` Return all descendant nodes that satisfy the given predicate in DFS order. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def find_descendants( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> list[SpanNode]: """Return all descendant nodes that satisfy the given predicate in DFS order.""" return list(self._filter_descendants(predicate, stop_recursing_when)) ``` #### first_descendant ```python first_descendant( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> SpanNode | None ``` DFS: Return the first descendant (in DFS order) that satisfies the given predicate, or `None` if none match. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def first_descendant( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> SpanNode | None: """DFS: Return the first descendant (in DFS order) that satisfies the given predicate, or `None` if none match.""" return next(self._filter_descendants(predicate, stop_recursing_when), None) ``` #### any_descendant ```python any_descendant( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> bool ``` Returns `True` if there is at least one descendant that satisfies the predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def any_descendant( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> bool: """Returns `True` if there is at least one descendant that satisfies the predicate.""" return self.first_descendant(predicate, stop_recursing_when) is not None ``` #### find_ancestors ```python find_ancestors( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> list[SpanNode] ``` Return all ancestors that satisfy the given predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def find_ancestors( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> list[SpanNode]: """Return all ancestors that satisfy the given predicate.""" return list(self._filter_ancestors(predicate, stop_recursing_when)) ``` #### first_ancestor ```python first_ancestor( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> SpanNode | None ``` Return the closest ancestor that satisfies the given predicate, or `None` if none match. 
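For the descendant and ancestor helpers, a sketch of how `stop_recursing_when` can prune the traversal; both span names are illustrative only:

span_traversal_queries.py
```python
from pydantic_evals.otel.span_tree import SpanNode


def tool_calls_outside_retries(node: SpanNode) -> list[SpanNode]:
    # Collect descendant tool spans, but stop descending into subtrees rooted at 'retry' spans.
    return node.find_descendants(
        {'name_contains': 'tool'},
        stop_recursing_when={'name_contains': 'retry'},
    )


def enclosing_agent_run(node: SpanNode) -> SpanNode | None:
    # Walk up the ancestor chain to the closest span that looks like an agent run.
    return node.first_ancestor({'name_contains': 'agent run'})
```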
Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def first_ancestor( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> SpanNode | None: """Return the closest ancestor that satisfies the given predicate, or `None` if none match.""" return next(self._filter_ancestors(predicate, stop_recursing_when), None) ``` #### any_ancestor ```python any_ancestor( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> bool ``` Returns True if any ancestor satisfies the predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def any_ancestor( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> bool: """Returns True if any ancestor satisfies the predicate.""" return self.first_ancestor(predicate, stop_recursing_when) is not None ``` #### matches ```python matches(query: SpanQuery | SpanPredicate) -> bool ``` Check if the span node matches the query conditions or predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def matches(self, query: SpanQuery | SpanPredicate) -> bool: """Check if the span node matches the query conditions or predicate.""" if callable(query): return query(self) return self._matches_query(query) ``` #### repr_xml ```python repr_xml( include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str ``` Return an XML-like string representation of the node. Optionally includes children, trace_id, span_id, start_timestamp, and duration. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def repr_xml( self, include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str: """Return an XML-like string representation of the node. Optionally includes children, trace_id, span_id, start_timestamp, and duration. """ first_line_parts = [f'<SpanNode name={self.name!r}'] if include_trace_id: first_line_parts.append(f"trace_id='{self.trace_id:032x}'") if include_span_id: first_line_parts.append(f"span_id='{self.span_id:016x}'") if include_start_timestamp: first_line_parts.append(f'start_timestamp={self.start_timestamp.isoformat()!r}') if include_duration: first_line_parts.append(f"duration='{self.duration}'") extra_lines: list[str] = [] if include_children and self.children: first_line_parts.append('>') for child in self.children: extra_lines.append( indent( child.repr_xml( include_children=include_children, include_trace_id=include_trace_id, include_span_id=include_span_id, include_start_timestamp=include_start_timestamp, include_duration=include_duration, ), ' ', ) ) extra_lines.append('</SpanNode>') else: if self.children: first_line_parts.append('children=...') first_line_parts.append('/>') return '\n'.join([' '.join(first_line_parts), *extra_lines]) ``` ### SpanQuery Bases: `TypedDict` A serializable query for filtering SpanNodes based on various conditions. All fields are optional and combined with AND logic by default. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python class SpanQuery(TypedDict, total=False): """A serializable query for filtering SpanNodes based on various conditions. All fields are optional and combined with AND logic by default. """ # These fields are ordered to match the implementation of SpanNode.matches_query for easy review.
# * Individual span conditions come first because these are generally the cheapest to evaluate # * Logical combinations come next because they may just be combinations of individual span conditions # * Related-span conditions come last because they may require the most work to evaluate # Individual span conditions ## Name conditions name_equals: str name_contains: str name_matches_regex: str # regex pattern ## Attribute conditions has_attributes: dict[str, Any] has_attribute_keys: list[str] ## Timing conditions min_duration: timedelta | float max_duration: timedelta | float # Logical combinations of conditions not_: SpanQuery and_: list[SpanQuery] or_: list[SpanQuery] # Child conditions min_child_count: int max_child_count: int some_child_has: SpanQuery all_children_have: SpanQuery no_child_has: SpanQuery # Recursive conditions stop_recursing_when: SpanQuery """If present, stop recursing through ancestors or descendants at nodes that match this condition.""" ## Descendant conditions min_descendant_count: int max_descendant_count: int some_descendant_has: SpanQuery all_descendants_have: SpanQuery no_descendant_has: SpanQuery ## Ancestor conditions min_depth: int # depth is equivalent to ancestor count; roots have depth 0 max_depth: int some_ancestor_has: SpanQuery all_ancestors_have: SpanQuery no_ancestor_has: SpanQuery ``` #### stop_recursing_when ```python stop_recursing_when: SpanQuery ``` If present, stop recursing through ancestors or descendants at nodes that match this condition. ### SpanTree A container that builds a hierarchy of SpanNode objects from a list of finished spans. You can then search or iterate the tree to make your assertions (using DFS for traversal). Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python @dataclass(repr=False, kw_only=True) class SpanTree: """A container that builds a hierarchy of SpanNode objects from a list of finished spans. You can then search or iterate the tree to make your assertions (using DFS for traversal). """ roots: list[SpanNode] = field(default_factory=list) nodes_by_id: dict[str, SpanNode] = field(default_factory=dict) # ------------------------------------------------------------------------- # Construction # ------------------------------------------------------------------------- def __post_init__(self): self._rebuild_tree() def add_spans(self, spans: list[SpanNode]) -> None: """Add a list of spans to the tree, rebuilding the tree structure.""" for span in spans: self.nodes_by_id[span.node_key] = span self._rebuild_tree() def add_readable_spans(self, readable_spans: list[ReadableSpan]): self.add_spans([SpanNode.from_readable_span(span) for span in readable_spans]) def _rebuild_tree(self): # Ensure spans are ordered by start_timestamp so that roots and children end up in the right order nodes = list(self.nodes_by_id.values()) nodes.sort(key=lambda node: node.start_timestamp or datetime.min) self.nodes_by_id = {node.node_key: node for node in nodes} # Build the parent/child relationships for node in self.nodes_by_id.values(): parent_node_key = node.parent_node_key if parent_node_key is not None: parent_node = self.nodes_by_id.get(parent_node_key) if parent_node is not None: parent_node.add_child(node) # Determine the roots # A node is a "root" if its parent is None or if its parent's span_id is not in the current set of spans. 
self.roots = [] for node in self.nodes_by_id.values(): parent_node_key = node.parent_node_key if parent_node_key is None or parent_node_key not in self.nodes_by_id: self.roots.append(node) # ------------------------------------------------------------------------- # Node filtering and iteration # ------------------------------------------------------------------------- def find(self, predicate: SpanQuery | SpanPredicate) -> list[SpanNode]: """Find all nodes in the entire tree that match the predicate, scanning from each root in DFS order.""" return list(self._filter(predicate)) def first(self, predicate: SpanQuery | SpanPredicate) -> SpanNode | None: """Find the first node that matches a predicate, scanning from each root in DFS order. Returns `None` if not found.""" return next(self._filter(predicate), None) def any(self, predicate: SpanQuery | SpanPredicate) -> bool: """Returns True if any node in the tree matches the predicate.""" return self.first(predicate) is not None def _filter(self, predicate: SpanQuery | SpanPredicate) -> Iterator[SpanNode]: for node in self: if node.matches(predicate): yield node def __iter__(self) -> Iterator[SpanNode]: """Return an iterator over all nodes in the tree.""" return iter(self.nodes_by_id.values()) # ------------------------------------------------------------------------- # String representation # ------------------------------------------------------------------------- def repr_xml( self, include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str: """Return an XML-like string representation of the tree, optionally including children, trace_id, span_id, duration, and timestamps.""" if not self.roots: return '<SpanTree />' repr_parts = [ '<SpanTree>', *[ indent( root.repr_xml( include_children=include_children, include_trace_id=include_trace_id, include_span_id=include_span_id, include_start_timestamp=include_start_timestamp, include_duration=include_duration, ), ' ', ) for root in self.roots ], '</SpanTree>', ] return '\n'.join(repr_parts) def __str__(self): return f'<SpanTree num_roots={len(self.roots)} total_spans={len(self.nodes_by_id)} />' def __repr__(self): return self.repr_xml() ``` #### add_spans ```python add_spans(spans: list[SpanNode]) -> None ``` Add a list of spans to the tree, rebuilding the tree structure. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def add_spans(self, spans: list[SpanNode]) -> None: """Add a list of spans to the tree, rebuilding the tree structure.""" for span in spans: self.nodes_by_id[span.node_key] = span self._rebuild_tree() ``` #### find ```python find( predicate: SpanQuery | SpanPredicate, ) -> list[SpanNode] ``` Find all nodes in the entire tree that match the predicate, scanning from each root in DFS order. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def find(self, predicate: SpanQuery | SpanPredicate) -> list[SpanNode]: """Find all nodes in the entire tree that match the predicate, scanning from each root in DFS order.""" return list(self._filter(predicate)) ``` #### first ```python first( predicate: SpanQuery | SpanPredicate, ) -> SpanNode | None ``` Find the first node that matches a predicate, scanning from each root in DFS order. Returns `None` if not found. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def first(self, predicate: SpanQuery | SpanPredicate) -> SpanNode | None: """Find the first node that matches a predicate, scanning from each root in DFS order.
Returns `None` if not found.""" return next(self._filter(predicate), None) ``` #### any ```python any(predicate: SpanQuery | SpanPredicate) -> bool ``` Returns True if any node in the tree matches the predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def any(self, predicate: SpanQuery | SpanPredicate) -> bool: """Returns True if any node in the tree matches the predicate.""" return self.first(predicate) is not None ``` #### __iter__ ```python __iter__() -> Iterator[SpanNode] ``` Return an iterator over all nodes in the tree. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def __iter__(self) -> Iterator[SpanNode]: """Return an iterator over all nodes in the tree.""" return iter(self.nodes_by_id.values()) ``` #### repr_xml ```python repr_xml( include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str ``` Return an XML-like string representation of the tree, optionally including children, trace_id, span_id, duration, and timestamps. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def repr_xml( self, include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str: """Return an XML-like string representation of the tree, optionally including children, trace_id, span_id, duration, and timestamps.""" if not self.roots: return '<SpanTree />' repr_parts = [ '<SpanTree>', *[ indent( root.repr_xml( include_children=include_children, include_trace_id=include_trace_id, include_span_id=include_span_id, include_start_timestamp=include_start_timestamp, include_duration=include_duration, ), ' ', ) for root in self.roots ], '</SpanTree>', ] return '\n'.join(repr_parts) ``` # `pydantic_evals.reporting` ### ReportCase Bases: `Generic[InputsT, OutputT, MetadataT]` A single case in an evaluation report. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @dataclass(kw_only=True) class ReportCase(Generic[InputsT, OutputT, MetadataT]): """A single case in an evaluation report.""" name: str """The name of the [case][pydantic_evals.Case].""" inputs: InputsT """The inputs to the task, from [`Case.inputs`][pydantic_evals.Case.inputs].""" metadata: MetadataT | None """Any metadata associated with the case, from [`Case.metadata`][pydantic_evals.Case.metadata].""" expected_output: OutputT | None """The expected output of the task, from [`Case.expected_output`][pydantic_evals.Case.expected_output].""" output: OutputT """The output of the task execution.""" metrics: dict[str, float | int] attributes: dict[str, Any] scores: dict[str, EvaluationResult[int | float]] labels: dict[str, EvaluationResult[str]] assertions: dict[str, EvaluationResult[bool]] task_duration: float total_duration: float # includes evaluator execution time trace_id: str | None = None """The trace ID of the case span.""" span_id: str | None = None """The span ID of the case span.""" evaluator_failures: list[EvaluatorFailure] = field(default_factory=list) ``` #### name ```python name: str ``` The name of the case. #### inputs ```python inputs: InputsT ``` The inputs to the task, from Case.inputs. #### metadata ```python metadata: MetadataT | None ``` Any metadata associated with the case, from Case.metadata. #### expected_output ```python expected_output: OutputT | None ``` The expected output of the task, from Case.expected_output.
#### output ```python output: OutputT ``` The output of the task execution. #### trace_id ```python trace_id: str | None = None ``` The trace ID of the case span. #### span_id ```python span_id: str | None = None ``` The span ID of the case span. ### ReportCaseFailure Bases: `Generic[InputsT, OutputT, MetadataT]` A single case in an evaluation report that failed due to an error during task execution. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @dataclass(kw_only=True) class ReportCaseFailure(Generic[InputsT, OutputT, MetadataT]): """A single case in an evaluation report that failed due to an error during task execution.""" name: str """The name of the [case][pydantic_evals.Case].""" inputs: InputsT """The inputs to the task, from [`Case.inputs`][pydantic_evals.Case.inputs].""" metadata: MetadataT | None """Any metadata associated with the case, from [`Case.metadata`][pydantic_evals.Case.metadata].""" expected_output: OutputT | None """The expected output of the task, from [`Case.expected_output`][pydantic_evals.Case.expected_output].""" error_message: str """The message of the exception that caused the failure.""" error_stacktrace: str """The stacktrace of the exception that caused the failure.""" trace_id: str | None = None """The trace ID of the case span.""" span_id: str | None = None """The span ID of the case span.""" ``` #### name ```python name: str ``` The name of the case. #### inputs ```python inputs: InputsT ``` The inputs to the task, from Case.inputs. #### metadata ```python metadata: MetadataT | None ``` Any metadata associated with the case, from Case.metadata. #### expected_output ```python expected_output: OutputT | None ``` The expected output of the task, from Case.expected_output. #### error_message ```python error_message: str ``` The message of the exception that caused the failure. #### error_stacktrace ```python error_stacktrace: str ``` The stacktrace of the exception that caused the failure. #### trace_id ```python trace_id: str | None = None ``` The trace ID of the case span. #### span_id ```python span_id: str | None = None ``` The span ID of the case span. ### ReportCaseAggregate Bases: `BaseModel` A synthetic case that summarizes a set of cases. 
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python class ReportCaseAggregate(BaseModel): """A synthetic case that summarizes a set of cases.""" name: str scores: dict[str, float | int] labels: dict[str, dict[str, float]] metrics: dict[str, float | int] assertions: float | None task_duration: float total_duration: float @staticmethod def average(cases: list[ReportCase]) -> ReportCaseAggregate: """Produce a synthetic "summary" case by averaging quantitative attributes.""" num_cases = len(cases) if num_cases == 0: return ReportCaseAggregate( name='Averages', scores={}, labels={}, metrics={}, assertions=None, task_duration=0.0, total_duration=0.0, ) def _scores_averages(scores_by_name: list[dict[str, int | float | bool]]) -> dict[str, float]: counts_by_name: dict[str, int] = defaultdict(int) sums_by_name: dict[str, float] = defaultdict(float) for sbn in scores_by_name: for name, score in sbn.items(): counts_by_name[name] += 1 sums_by_name[name] += score return {name: sums_by_name[name] / counts_by_name[name] for name in sums_by_name} def _labels_averages(labels_by_name: list[dict[str, str]]) -> dict[str, dict[str, float]]: counts_by_name: dict[str, int] = defaultdict(int) sums_by_name: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float)) for lbn in labels_by_name: for name, label in lbn.items(): counts_by_name[name] += 1 sums_by_name[name][label] += 1 return { name: {value: count / counts_by_name[name] for value, count in sums_by_name[name].items()} for name in sums_by_name } average_task_duration = sum(case.task_duration for case in cases) / num_cases average_total_duration = sum(case.total_duration for case in cases) / num_cases # average_assertions: dict[str, float] = _scores_averages([{k: v.value for k, v in case.scores.items()} for case in cases]) average_scores: dict[str, float] = _scores_averages( [{k: v.value for k, v in case.scores.items()} for case in cases] ) average_labels: dict[str, dict[str, float]] = _labels_averages( [{k: v.value for k, v in case.labels.items()} for case in cases] ) average_metrics: dict[str, float] = _scores_averages([case.metrics for case in cases]) average_assertions: float | None = None n_assertions = sum(len(case.assertions) for case in cases) if n_assertions > 0: n_passing = sum(1 for case in cases for assertion in case.assertions.values() if assertion.value) average_assertions = n_passing / n_assertions return ReportCaseAggregate( name='Averages', scores=average_scores, labels=average_labels, metrics=average_metrics, assertions=average_assertions, task_duration=average_task_duration, total_duration=average_total_duration, ) ``` #### average ```python average(cases: list[ReportCase]) -> ReportCaseAggregate ``` Produce a synthetic "summary" case by averaging quantitative attributes. 
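A minimal sketch of computing the aggregate directly; this is the same calculation that `EvaluationReport.averages()` (documented below) performs on its own cases, and it assumes you already have an `EvaluationReport` from running an evaluation elsewhere:

report_aggregate.py
```python
from pydantic_evals.reporting import EvaluationReport, ReportCaseAggregate


def summarize(report: EvaluationReport) -> None:
    aggregate = ReportCaseAggregate.average(report.cases)
    # Average scores/metrics per name, plus the fraction of passing assertions
    # (None if no assertions were recorded).
    print(aggregate.scores, aggregate.metrics, aggregate.assertions, aggregate.task_duration)
```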
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @staticmethod def average(cases: list[ReportCase]) -> ReportCaseAggregate: """Produce a synthetic "summary" case by averaging quantitative attributes.""" num_cases = len(cases) if num_cases == 0: return ReportCaseAggregate( name='Averages', scores={}, labels={}, metrics={}, assertions=None, task_duration=0.0, total_duration=0.0, ) def _scores_averages(scores_by_name: list[dict[str, int | float | bool]]) -> dict[str, float]: counts_by_name: dict[str, int] = defaultdict(int) sums_by_name: dict[str, float] = defaultdict(float) for sbn in scores_by_name: for name, score in sbn.items(): counts_by_name[name] += 1 sums_by_name[name] += score return {name: sums_by_name[name] / counts_by_name[name] for name in sums_by_name} def _labels_averages(labels_by_name: list[dict[str, str]]) -> dict[str, dict[str, float]]: counts_by_name: dict[str, int] = defaultdict(int) sums_by_name: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float)) for lbn in labels_by_name: for name, label in lbn.items(): counts_by_name[name] += 1 sums_by_name[name][label] += 1 return { name: {value: count / counts_by_name[name] for value, count in sums_by_name[name].items()} for name in sums_by_name } average_task_duration = sum(case.task_duration for case in cases) / num_cases average_total_duration = sum(case.total_duration for case in cases) / num_cases # average_assertions: dict[str, float] = _scores_averages([{k: v.value for k, v in case.scores.items()} for case in cases]) average_scores: dict[str, float] = _scores_averages( [{k: v.value for k, v in case.scores.items()} for case in cases] ) average_labels: dict[str, dict[str, float]] = _labels_averages( [{k: v.value for k, v in case.labels.items()} for case in cases] ) average_metrics: dict[str, float] = _scores_averages([case.metrics for case in cases]) average_assertions: float | None = None n_assertions = sum(len(case.assertions) for case in cases) if n_assertions > 0: n_passing = sum(1 for case in cases for assertion in case.assertions.values() if assertion.value) average_assertions = n_passing / n_assertions return ReportCaseAggregate( name='Averages', scores=average_scores, labels=average_labels, metrics=average_metrics, assertions=average_assertions, task_duration=average_task_duration, total_duration=average_total_duration, ) ``` ### EvaluationReport Bases: `Generic[InputsT, OutputT, MetadataT]` A report of the results of evaluating a model on a set of cases. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @dataclass(kw_only=True) class EvaluationReport(Generic[InputsT, OutputT, MetadataT]): """A report of the results of evaluating a model on a set of cases.""" name: str """The name of the report.""" cases: list[ReportCase[InputsT, OutputT, MetadataT]] """The cases in the report.""" failures: list[ReportCaseFailure[InputsT, OutputT, MetadataT]] = field(default_factory=list) """The failures in the report. 
These are cases where task execution raised an exception.""" trace_id: str | None = None """The trace ID of the evaluation.""" span_id: str | None = None """The span ID of the evaluation.""" def averages(self) -> ReportCaseAggregate | None: if self.cases: return ReportCaseAggregate.average(self.cases) return None def print( self, width: int | None = None, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ): # pragma: no cover """Print this report to the console, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. """ table = self.console_table( baseline=baseline, include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_evaluator_failures=include_evaluator_failures, input_config=input_config, metadata_config=metadata_config, output_config=output_config, score_configs=score_configs, label_configs=label_configs, metric_configs=metric_configs, duration_config=duration_config, include_reasons=include_reasons, ) console = Console(width=width) console.print(table) if include_errors and self.failures: failures_table = self.failures_table( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_error_message=True, include_error_stacktrace=include_error_stacktrace, input_config=input_config, metadata_config=metadata_config, ) console.print(failures_table, style='red') def console_table( self, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> Table: """Return a table containing the data from this report, or the diff between this report and a baseline report. Optionally include input and output details. 
""" renderer = EvaluationRenderer( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_error_message=False, include_error_stacktrace=False, include_evaluator_failures=include_evaluator_failures, input_config={**_DEFAULT_VALUE_CONFIG, **(input_config or {})}, metadata_config={**_DEFAULT_VALUE_CONFIG, **(metadata_config or {})}, output_config=output_config or _DEFAULT_VALUE_CONFIG, score_configs=score_configs or {}, label_configs=label_configs or {}, metric_configs=metric_configs or {}, duration_config=duration_config or _DEFAULT_DURATION_CONFIG, include_reasons=include_reasons, ) if baseline is None: return renderer.build_table(self) else: # pragma: no cover return renderer.build_diff_table(self, baseline) def failures_table( self, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_error_message: bool = True, include_error_stacktrace: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, ) -> Table: """Return a table containing the failures in this report.""" renderer = EvaluationRenderer( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=False, include_durations=False, include_total_duration=False, include_removed_cases=False, include_averages=False, input_config={**_DEFAULT_VALUE_CONFIG, **(input_config or {})}, metadata_config={**_DEFAULT_VALUE_CONFIG, **(metadata_config or {})}, output_config=_DEFAULT_VALUE_CONFIG, score_configs={}, label_configs={}, metric_configs={}, duration_config=_DEFAULT_DURATION_CONFIG, include_reasons=False, include_error_message=include_error_message, include_error_stacktrace=include_error_stacktrace, include_evaluator_failures=False, # Not applicable for failures table ) return renderer.build_failures_table(self) def __str__(self) -> str: # pragma: lax no cover """Return a string representation of the report.""" table = self.console_table() io_file = StringIO() Console(file=io_file).print(table) return io_file.getvalue() ``` #### name ```python name: str ``` The name of the report. #### cases ```python cases: list[ReportCase[InputsT, OutputT, MetadataT]] ``` The cases in the report. #### failures ```python failures: list[ ReportCaseFailure[InputsT, OutputT, MetadataT] ] = field(default_factory=list) ``` The failures in the report. These are cases where task execution raised an exception. #### trace_id ```python trace_id: str | None = None ``` The trace ID of the evaluation. #### span_id ```python span_id: str | None = None ``` The span ID of the evaluation. 
#### print ```python print( width: int | None = None, baseline: ( EvaluationReport[InputsT, OutputT, MetadataT] | None ) = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: ( dict[str, RenderNumberConfig] | None ) = None, label_configs: ( dict[str, RenderValueConfig] | None ) = None, metric_configs: ( dict[str, RenderNumberConfig] | None ) = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False ) ``` Print this report to the console, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def print( self, width: int | None = None, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ): # pragma: no cover """Print this report to the console, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. 
""" table = self.console_table( baseline=baseline, include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_evaluator_failures=include_evaluator_failures, input_config=input_config, metadata_config=metadata_config, output_config=output_config, score_configs=score_configs, label_configs=label_configs, metric_configs=metric_configs, duration_config=duration_config, include_reasons=include_reasons, ) console = Console(width=width) console.print(table) if include_errors and self.failures: failures_table = self.failures_table( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_error_message=True, include_error_stacktrace=include_error_stacktrace, input_config=input_config, metadata_config=metadata_config, ) console.print(failures_table, style='red') ``` #### console_table ```python console_table( baseline: ( EvaluationReport[InputsT, OutputT, MetadataT] | None ) = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: ( dict[str, RenderNumberConfig] | None ) = None, label_configs: ( dict[str, RenderValueConfig] | None ) = None, metric_configs: ( dict[str, RenderNumberConfig] | None ) = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False ) -> Table ``` Return a table containing the data from this report, or the diff between this report and a baseline report. Optionally include input and output details. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def console_table( self, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> Table: """Return a table containing the data from this report, or the diff between this report and a baseline report. Optionally include input and output details. 
""" renderer = EvaluationRenderer( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_error_message=False, include_error_stacktrace=False, include_evaluator_failures=include_evaluator_failures, input_config={**_DEFAULT_VALUE_CONFIG, **(input_config or {})}, metadata_config={**_DEFAULT_VALUE_CONFIG, **(metadata_config or {})}, output_config=output_config or _DEFAULT_VALUE_CONFIG, score_configs=score_configs or {}, label_configs=label_configs or {}, metric_configs=metric_configs or {}, duration_config=duration_config or _DEFAULT_DURATION_CONFIG, include_reasons=include_reasons, ) if baseline is None: return renderer.build_table(self) else: # pragma: no cover return renderer.build_diff_table(self, baseline) ``` #### failures_table ```python failures_table( *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_error_message: bool = True, include_error_stacktrace: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None ) -> Table ``` Return a table containing the failures in this report. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def failures_table( self, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_error_message: bool = True, include_error_stacktrace: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, ) -> Table: """Return a table containing the failures in this report.""" renderer = EvaluationRenderer( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=False, include_durations=False, include_total_duration=False, include_removed_cases=False, include_averages=False, input_config={**_DEFAULT_VALUE_CONFIG, **(input_config or {})}, metadata_config={**_DEFAULT_VALUE_CONFIG, **(metadata_config or {})}, output_config=_DEFAULT_VALUE_CONFIG, score_configs={}, label_configs={}, metric_configs={}, duration_config=_DEFAULT_DURATION_CONFIG, include_reasons=False, include_error_message=include_error_message, include_error_stacktrace=include_error_stacktrace, include_evaluator_failures=False, # Not applicable for failures table ) return renderer.build_failures_table(self) ``` #### __str__ ```python __str__() -> str ``` Return a string representation of the report. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def __str__(self) -> str: # pragma: lax no cover """Return a string representation of the report.""" table = self.console_table() io_file = StringIO() Console(file=io_file).print(table) return io_file.getvalue() ``` ### RenderValueConfig Bases: `TypedDict` A configuration for rendering a values in an Evaluation report. 
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python class RenderValueConfig(TypedDict, total=False): """A configuration for rendering a values in an Evaluation report.""" value_formatter: str | Callable[[Any], str] diff_checker: Callable[[Any, Any], bool] | None diff_formatter: Callable[[Any, Any], str | None] | None diff_style: str ``` ### RenderNumberConfig Bases: `TypedDict` A configuration for rendering a particular score or metric in an Evaluation report. See the implementation of `_RenderNumber` for more clarity on how these parameters affect the rendering. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python class RenderNumberConfig(TypedDict, total=False): """A configuration for rendering a particular score or metric in an Evaluation report. See the implementation of `_RenderNumber` for more clarity on how these parameters affect the rendering. """ value_formatter: str | Callable[[float | int], str] """The logic to use for formatting values. * If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures. * You can also use a custom string format spec, e.g. '{:.3f}' * You can also use a custom function, e.g. lambda x: f'{x:.3f}' """ diff_formatter: str | Callable[[float | int, float | int], str | None] | None """The logic to use for formatting details about the diff. The strings produced by the value_formatter will always be included in the reports, but the diff_formatter is used to produce additional text about the difference between the old and new values, such as the absolute or relative difference. * If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures, and will include the percentage change. * You can also use a custom string format spec, e.g. '{:+.3f}' * You can also use a custom function, e.g. lambda x: f'{x:+.3f}'. If this function returns None, no extra diff text will be added. * You can also use None to never generate extra diff text. """ diff_atol: float """The absolute tolerance for considering a difference "significant". A difference is "significant" if `abs(new - old) < self.diff_atol + self.diff_rtol * abs(old)`. If a difference is not significant, it will not have the diff styles applied. Note that we still show both the rendered before and after values in the diff any time they differ, even if the difference is not significant. (If the rendered values are exactly the same, we only show the value once.) If not provided, use 1e-6. """ diff_rtol: float """The relative tolerance for considering a difference "significant". See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use 0.001 if all values are ints, otherwise 0.05. """ diff_increase_style: str """The style to apply to diffed values that have a significant increase. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use green for scores and red for metrics. You can also use arbitrary `rich` styles, such as "bold red". """ diff_decrease_style: str """The style to apply to diffed values that have significant decrease. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use red for scores and green for metrics. You can also use arbitrary `rich` styles, such as "bold red". 
""" ``` #### value_formatter ```python value_formatter: str | Callable[[float | int], str] ``` The logic to use for formatting values. - If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures. - You can also use a custom string format spec, e.g. '{:.3f}' - You can also use a custom function, e.g. lambda x: f'{x:.3f}' #### diff_formatter ```python diff_formatter: ( str | Callable[[float | int, float | int], str | None] | None ) ``` The logic to use for formatting details about the diff. The strings produced by the value_formatter will always be included in the reports, but the diff_formatter is used to produce additional text about the difference between the old and new values, such as the absolute or relative difference. - If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures, and will include the percentage change. - You can also use a custom string format spec, e.g. '{:+.3f}' - You can also use a custom function, e.g. lambda x: f'{x:+.3f}'. If this function returns None, no extra diff text will be added. - You can also use None to never generate extra diff text. #### diff_atol ```python diff_atol: float ``` The absolute tolerance for considering a difference "significant". A difference is "significant" if `abs(new - old) < self.diff_atol + self.diff_rtol * abs(old)`. If a difference is not significant, it will not have the diff styles applied. Note that we still show both the rendered before and after values in the diff any time they differ, even if the difference is not significant. (If the rendered values are exactly the same, we only show the value once.) If not provided, use 1e-6. #### diff_rtol ```python diff_rtol: float ``` The relative tolerance for considering a difference "significant". See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use 0.001 if all values are ints, otherwise 0.05. #### diff_increase_style ```python diff_increase_style: str ``` The style to apply to diffed values that have a significant increase. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use green for scores and red for metrics. You can also use arbitrary `rich` styles, such as "bold red". #### diff_decrease_style ```python diff_decrease_style: str ``` The style to apply to diffed values that have significant decrease. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use red for scores and green for metrics. You can also use arbitrary `rich` styles, such as "bold red". ### EvaluationRenderer A class for rendering an EvalReport or the diff between two EvalReports. 
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @dataclass(kw_only=True) class EvaluationRenderer: """A class for rendering an EvalReport or the diff between two EvalReports.""" # Columns to include include_input: bool include_metadata: bool include_expected_output: bool include_output: bool include_durations: bool include_total_duration: bool # Rows to include include_removed_cases: bool include_averages: bool input_config: RenderValueConfig metadata_config: RenderValueConfig output_config: RenderValueConfig score_configs: dict[str, RenderNumberConfig] label_configs: dict[str, RenderValueConfig] metric_configs: dict[str, RenderNumberConfig] duration_config: RenderNumberConfig # Data to include include_reasons: bool # only applies to reports, not to diffs include_error_message: bool include_error_stacktrace: bool include_evaluator_failures: bool def include_scores(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return any(case.scores for case in self._all_cases(report, baseline)) def include_labels(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return any(case.labels for case in self._all_cases(report, baseline)) def include_metrics(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return any(case.metrics for case in self._all_cases(report, baseline)) def include_assertions(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return any(case.assertions for case in self._all_cases(report, baseline)) def include_evaluator_failures_column(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return self.include_evaluator_failures and any( case.evaluator_failures for case in self._all_cases(report, baseline) ) def _all_cases(self, report: EvaluationReport, baseline: EvaluationReport | None) -> list[ReportCase]: if not baseline: return report.cases else: return report.cases + self._baseline_cases_to_include(report, baseline) def _baseline_cases_to_include(self, report: EvaluationReport, baseline: EvaluationReport) -> list[ReportCase]: if self.include_removed_cases: return baseline.cases report_case_names = {case.name for case in report.cases} return [case for case in baseline.cases if case.name in report_case_names] def _get_case_renderer( self, report: EvaluationReport, baseline: EvaluationReport | None = None ) -> ReportCaseRenderer: input_renderer = _ValueRenderer.from_config(self.input_config) metadata_renderer = _ValueRenderer.from_config(self.metadata_config) output_renderer = _ValueRenderer.from_config(self.output_config) score_renderers = self._infer_score_renderers(report, baseline) label_renderers = self._infer_label_renderers(report, baseline) metric_renderers = self._infer_metric_renderers(report, baseline) duration_renderer = _NumberRenderer.infer_from_config( self.duration_config, 'duration', [x.task_duration for x in self._all_cases(report, baseline)] ) return ReportCaseRenderer( include_input=self.include_input, include_metadata=self.include_metadata, include_expected_output=self.include_expected_output, include_output=self.include_output, include_scores=self.include_scores(report, baseline), include_labels=self.include_labels(report, baseline), include_metrics=self.include_metrics(report, baseline), include_assertions=self.include_assertions(report, baseline), include_reasons=self.include_reasons, include_durations=self.include_durations, include_total_duration=self.include_total_duration, 
include_error_message=self.include_error_message, include_error_stacktrace=self.include_error_stacktrace, include_evaluator_failures=self.include_evaluator_failures_column(report, baseline), input_renderer=input_renderer, metadata_renderer=metadata_renderer, output_renderer=output_renderer, score_renderers=score_renderers, label_renderers=label_renderers, metric_renderers=metric_renderers, duration_renderer=duration_renderer, ) def build_table(self, report: EvaluationReport) -> Table: case_renderer = self._get_case_renderer(report) table = case_renderer.build_base_table(f'Evaluation Summary: {report.name}') for case in report.cases: table.add_row(*case_renderer.build_row(case)) if self.include_averages: # pragma: no branch average = report.averages() if average: # pragma: no branch table.add_row(*case_renderer.build_aggregate_row(average)) return table def build_diff_table(self, report: EvaluationReport, baseline: EvaluationReport) -> Table: report_cases = report.cases baseline_cases = self._baseline_cases_to_include(report, baseline) report_cases_by_id = {case.name: case for case in report_cases} baseline_cases_by_id = {case.name: case for case in baseline_cases} diff_cases: list[tuple[ReportCase, ReportCase]] = [] removed_cases: list[ReportCase] = [] added_cases: list[ReportCase] = [] for case_id in sorted(set(baseline_cases_by_id.keys()) | set(report_cases_by_id.keys())): maybe_baseline_case = baseline_cases_by_id.get(case_id) maybe_report_case = report_cases_by_id.get(case_id) if maybe_baseline_case and maybe_report_case: diff_cases.append((maybe_baseline_case, maybe_report_case)) elif maybe_baseline_case: removed_cases.append(maybe_baseline_case) elif maybe_report_case: added_cases.append(maybe_report_case) else: # pragma: no cover assert False, 'This should be unreachable' case_renderer = self._get_case_renderer(report, baseline) diff_name = baseline.name if baseline.name == report.name else f'{baseline.name} → {report.name}' table = case_renderer.build_base_table(f'Evaluation Diff: {diff_name}') for baseline_case, new_case in diff_cases: table.add_row(*case_renderer.build_diff_row(new_case, baseline_case)) for case in added_cases: row = case_renderer.build_row(case) row[0] = f'[green]+ Added Case[/]\n{row[0]}' table.add_row(*row) for case in removed_cases: row = case_renderer.build_row(case) row[0] = f'[red]- Removed Case[/]\n{row[0]}' table.add_row(*row) if self.include_averages: # pragma: no branch report_average = ReportCaseAggregate.average(report_cases) baseline_average = ReportCaseAggregate.average(baseline_cases) table.add_row(*case_renderer.build_diff_aggregate_row(report_average, baseline_average)) return table def build_failures_table(self, report: EvaluationReport) -> Table: case_renderer = self._get_case_renderer(report) table = case_renderer.build_failures_table('Case Failures') for case in report.failures: table.add_row(*case_renderer.build_failure_row(case)) return table def _infer_score_renderers( self, report: EvaluationReport, baseline: EvaluationReport | None ) -> dict[str, _NumberRenderer]: all_cases = self._all_cases(report, baseline) values_by_name: dict[str, list[float | int]] = {} for case in all_cases: for k, score in case.scores.items(): values_by_name.setdefault(k, []).append(score.value) all_renderers: dict[str, _NumberRenderer] = {} for name, values in values_by_name.items(): merged_config = _DEFAULT_NUMBER_CONFIG.copy() merged_config.update(self.score_configs.get(name, {})) all_renderers[name] = _NumberRenderer.infer_from_config(merged_config, 'score', 
values) return all_renderers def _infer_label_renderers( self, report: EvaluationReport, baseline: EvaluationReport | None ) -> dict[str, _ValueRenderer]: all_cases = self._all_cases(report, baseline) all_names: set[str] = set() for case in all_cases: for k in case.labels: all_names.add(k) all_renderers: dict[str, _ValueRenderer] = {} for name in all_names: merged_config = _DEFAULT_VALUE_CONFIG.copy() merged_config.update(self.label_configs.get(name, {})) all_renderers[name] = _ValueRenderer.from_config(merged_config) return all_renderers def _infer_metric_renderers( self, report: EvaluationReport, baseline: EvaluationReport | None ) -> dict[str, _NumberRenderer]: all_cases = self._all_cases(report, baseline) values_by_name: dict[str, list[float | int]] = {} for case in all_cases: for k, v in case.metrics.items(): values_by_name.setdefault(k, []).append(v) all_renderers: dict[str, _NumberRenderer] = {} for name, values in values_by_name.items(): merged_config = _DEFAULT_NUMBER_CONFIG.copy() merged_config.update(self.metric_configs.get(name, {})) all_renderers[name] = _NumberRenderer.infer_from_config(merged_config, 'metric', values) return all_renderers def _infer_duration_renderer( self, report: EvaluationReport, baseline: EvaluationReport | None ) -> _NumberRenderer: # pragma: no cover all_cases = self._all_cases(report, baseline) all_durations = [x.task_duration for x in all_cases] if self.include_total_duration: all_durations += [x.total_duration for x in all_cases] return _NumberRenderer.infer_from_config(self.duration_config, 'duration', all_durations) ``` # `pydantic_graph.exceptions` ### GraphSetupError Bases: `TypeError` Error caused by an incorrectly configured graph. Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python class GraphSetupError(TypeError): """Error caused by an incorrectly configured graph.""" message: str """Description of the mistake.""" def __init__(self, message: str): self.message = message super().__init__(message) ``` #### message ```python message: str = message ``` Description of the mistake. ### GraphRuntimeError Bases: `RuntimeError` Error caused by an issue during graph execution. Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python class GraphRuntimeError(RuntimeError): """Error caused by an issue during graph execution.""" message: str """The error message.""" def __init__(self, message: str): self.message = message super().__init__(message) ``` #### message ```python message: str = message ``` The error message. ### GraphNodeStatusError Bases: `GraphRuntimeError` Error caused by trying to run a node that already has status `'running'`, `'success'`, or `'error'`. Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python class GraphNodeStatusError(GraphRuntimeError): """Error caused by trying to run a node that already has status `'running'`, `'success'`, or `'error'`.""" def __init__(self, actual_status: 'SnapshotStatus'): self.actual_status = actual_status super().__init__(f"Incorrect snapshot status {actual_status!r}, must be 'created' or 'pending'.") @classmethod def check(cls, status: 'SnapshotStatus') -> None: """Check if the status is valid.""" if status not in {'created', 'pending'}: raise cls(status) ``` #### check ```python check(status: SnapshotStatus) -> None ``` Check if the status is valid. 
Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python @classmethod def check(cls, status: 'SnapshotStatus') -> None: """Check if the status is valid.""" if status not in {'created', 'pending'}: raise cls(status) ``` # `pydantic_graph` ### Graph Bases: `Generic[StateT, DepsT, RunEndT]` Definition of a graph. In `pydantic-graph`, a graph is a collection of nodes that can be run in sequence. The nodes define their outgoing edges — e.g. which nodes may be run next, and thereby the structure of the graph. Here's a very simple example of a graph which increments a number by 1, but makes sure the number is never 42 at the end. never_42.py ```py from __future__ import annotations from dataclasses import dataclass from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class MyState: number: int @dataclass class Increment(BaseNode[MyState]): async def run(self, ctx: GraphRunContext) -> Check42: ctx.state.number += 1 return Check42() @dataclass class Check42(BaseNode[MyState, None, int]): async def run(self, ctx: GraphRunContext) -> Increment | End[int]: if ctx.state.number == 42: return Increment() else: return End(ctx.state.number) never_42_graph = Graph(nodes=(Increment, Check42)) ``` *(This example is complete, it can be run "as is")* See run For an example of running graph, and mermaid_code for an example of generating a mermaid diagram from the graph. Source code in `pydantic_graph/pydantic_graph/graph.py` ````python @dataclass(init=False) class Graph(Generic[StateT, DepsT, RunEndT]): """Definition of a graph. In `pydantic-graph`, a graph is a collection of nodes that can be run in sequence. The nodes define their outgoing edges — e.g. which nodes may be run next, and thereby the structure of the graph. Here's a very simple example of a graph which increments a number by 1, but makes sure the number is never 42 at the end. ```py {title="never_42.py" noqa="I001"} from __future__ import annotations from dataclasses import dataclass from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class MyState: number: int @dataclass class Increment(BaseNode[MyState]): async def run(self, ctx: GraphRunContext) -> Check42: ctx.state.number += 1 return Check42() @dataclass class Check42(BaseNode[MyState, None, int]): async def run(self, ctx: GraphRunContext) -> Increment | End[int]: if ctx.state.number == 42: return Increment() else: return End(ctx.state.number) never_42_graph = Graph(nodes=(Increment, Check42)) ``` _(This example is complete, it can be run "as is")_ See [`run`][pydantic_graph.graph.Graph.run] For an example of running graph, and [`mermaid_code`][pydantic_graph.graph.Graph.mermaid_code] for an example of generating a mermaid diagram from the graph. """ name: str | None node_defs: dict[str, NodeDef[StateT, DepsT, RunEndT]] _state_type: type[StateT] | _utils.Unset = field(repr=False) _run_end_type: type[RunEndT] | _utils.Unset = field(repr=False) auto_instrument: bool = field(repr=False) def __init__( self, *, nodes: Sequence[type[BaseNode[StateT, DepsT, RunEndT]]], name: str | None = None, state_type: type[StateT] | _utils.Unset = _utils.UNSET, run_end_type: type[RunEndT] | _utils.Unset = _utils.UNSET, auto_instrument: bool = True, ): """Create a graph from a sequence of nodes. Args: nodes: The nodes which make up the graph, nodes need to be unique and all be generic in the same state type. name: Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. 
state_type: The type of the state for the graph, this can generally be inferred from `nodes`. run_end_type: The type of the result of running the graph, this can generally be inferred from `nodes`. auto_instrument: Whether to create a span for the graph run and the execution of each node's run method. """ self.name = name self._state_type = state_type self._run_end_type = run_end_type self.auto_instrument = auto_instrument parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) self.node_defs = {} for node in nodes: self._register_node(node, parent_namespace) self._validate_edges() async def run( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT]: """Run the graph from a starting node until it ends. Args: start_node: the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. infer_name: Whether to infer the graph name from the calling frame. Returns: A `GraphRunResult` containing information about the run, including its final result. Here's an example of running the graph from [above][pydantic_graph.graph.Graph]: ```py {title="run_never_42.py" noqa="I001" requires="never_42.py"} from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=2) state = MyState(41) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=43) ``` """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) async with self.iter( start_node, state=state, deps=deps, persistence=persistence, infer_name=False ) as graph_run: async for _node in graph_run: pass result = graph_run.result assert result is not None, 'GraphRun should have a result' return result def run_sync( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT]: """Synchronously run the graph. This is a convenience method that wraps [`self.run`][pydantic_graph.Graph.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Args: start_node: the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. infer_name: Whether to infer the graph name from the calling frame. Returns: The result type from ending the run and the history of the run. 
""" if infer_name and self.name is None: # pragma: no branch self._infer_name(inspect.currentframe()) return _utils.get_event_loop().run_until_complete( self.run(start_node, state=state, deps=deps, persistence=persistence, infer_name=False) ) @asynccontextmanager async def iter( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]: """A contextmanager which can be used to iterate over the graph's nodes as they are executed. This method returns a `GraphRun` object which can be used to async-iterate over the nodes of this `Graph` as they are executed. This is the API to use if you want to record or interact with the nodes as the graph execution unfolds. The `GraphRun` can also be used to manually drive the graph execution by calling [`GraphRun.next`][pydantic_graph.graph.GraphRun.next]. The `GraphRun` provides access to the full run history, state, deps, and the final result of the run once it has completed. For more details, see the API documentation of [`GraphRun`][pydantic_graph.graph.GraphRun]. Args: start_node: the first node to run. Since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. span: The span to use for the graph run. If not provided, a new span will be created. infer_name: Whether to infer the graph name from the calling frame. Returns: A GraphRun that can be async iterated over to drive the graph to completion. """ if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) if persistence is None: persistence = SimpleStatePersistence() persistence.set_graph_types(self) with ExitStack() as stack: entered_span: AbstractSpan | None = None if span is None: if self.auto_instrument: entered_span = stack.enter_context(logfire_span('run graph {graph.name}', graph=self)) else: entered_span = stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) yield GraphRun[StateT, DepsT, RunEndT]( graph=self, start_node=start_node, persistence=persistence, state=state, deps=deps, traceparent=traceparent, ) @asynccontextmanager async def iter_from_persistence( self, persistence: BaseStatePersistence[StateT, RunEndT], *, deps: DepsT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]: """A contextmanager to iterate over the graph's nodes as they are executed, created from a persistence object. This method has similar functionality to [`iter`][pydantic_graph.graph.Graph.iter], but instead of passing the node to run, it will restore the node and state from state persistence. Args: persistence: The state persistence interface to use. deps: The dependencies of the graph. span: The span to use for the graph run. If not provided, a new span will be created. infer_name: Whether to infer the graph name from the calling frame. Returns: A GraphRun that can be async iterated over to drive the graph to completion. 
""" if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) persistence.set_graph_types(self) snapshot = await persistence.load_next() if snapshot is None: raise exceptions.GraphRuntimeError('Unable to restore snapshot from state persistence.') snapshot.node.set_snapshot_id(snapshot.id) if self.auto_instrument and span is None: # pragma: no branch span = logfire_span('run graph {graph.name}', graph=self) with ExitStack() as stack: entered_span = None if span is None else stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) yield GraphRun[StateT, DepsT, RunEndT]( graph=self, start_node=snapshot.node, persistence=persistence, state=snapshot.state, deps=deps, snapshot_id=snapshot.id, traceparent=traceparent, ) async def initialize( self, node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], *, state: StateT = None, infer_name: bool = True, ) -> None: """Initialize a new graph run in persistence without running it. This is useful if you want to set up a graph run to be run later, e.g. via [`iter_from_persistence`][pydantic_graph.graph.Graph.iter_from_persistence]. Args: node: The node to run first. persistence: State persistence interface. state: The start state of the graph. infer_name: Whether to infer the graph name from the calling frame. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) persistence.set_graph_types(self) await persistence.snapshot_node(state, node) def mermaid_code( self, *, start_node: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, title: str | None | typing_extensions.Literal[False] = None, edge_labels: bool = True, notes: bool = True, highlighted_nodes: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, highlight_css: str = mermaid.DEFAULT_HIGHLIGHT_CSS, infer_name: bool = True, direction: mermaid.StateDiagramDirection | None = None, ) -> str: """Generate a diagram representing the graph as [mermaid](https://mermaid.js.org/) diagram. This method calls [`pydantic_graph.mermaid.generate_code`][pydantic_graph.mermaid.generate_code]. Args: start_node: The node or nodes which can start the graph. title: The title of the diagram, use `False` to not include a title. edge_labels: Whether to include edge labels. notes: Whether to include notes on each node. highlighted_nodes: Optional node or nodes to highlight. highlight_css: The CSS to use for highlighting nodes. infer_name: Whether to infer the graph name from the calling frame. direction: The direction of flow. Returns: The mermaid code for the graph, which can then be rendered as a diagram. 
Here's an example of generating a diagram for the graph from [above][pydantic_graph.graph.Graph]: ```py {title="mermaid_never_42.py" requires="never_42.py"} from never_42 import Increment, never_42_graph print(never_42_graph.mermaid_code(start_node=Increment)) ''' --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ''' ``` The rendered diagram will look like this: ```mermaid --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ``` """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if title is None and self.name: title = self.name return mermaid.generate_code( self, start_node=start_node, highlighted_nodes=highlighted_nodes, highlight_css=highlight_css, title=title or None, edge_labels=edge_labels, notes=notes, direction=direction, ) def mermaid_image( self, infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] ) -> bytes: """Generate a diagram representing the graph as an image. The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`][pydantic_graph.mermaid.MermaidConfig]. !!! note "Uses external service" This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Args: infer_name: Whether to infer the graph name from the calling frame. **kwargs: Additional arguments to pass to `mermaid.request_image`. Returns: The image bytes. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if 'title' not in kwargs and self.name: kwargs['title'] = self.name return mermaid.request_image(self, **kwargs) def mermaid_save( self, path: Path | str, /, *, infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] ) -> None: """Generate a diagram representing the graph and save it as an image. The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`][pydantic_graph.mermaid.MermaidConfig]. !!! note "Uses external service" This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Args: path: The path to save the image to. infer_name: Whether to infer the graph name from the calling frame. **kwargs: Additional arguments to pass to `mermaid.save_image`. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if 'title' not in kwargs and self.name: kwargs['title'] = self.name mermaid.save_image(path, self, **kwargs) def get_nodes(self) -> Sequence[type[BaseNode[StateT, DepsT, RunEndT]]]: """Get the nodes in the graph.""" return [node_def.node for node_def in self.node_defs.values()] @cached_property def inferred_types(self) -> tuple[type[StateT], type[RunEndT]]: # Get the types of the state and run end from the graph. 
if _utils.is_set(self._state_type) and _utils.is_set(self._run_end_type): return self._state_type, self._run_end_type state_type = self._state_type run_end_type = self._run_end_type for node_def in self.node_defs.values(): for base in typing_extensions.get_original_bases(node_def.node): if typing_extensions.get_origin(base) is BaseNode: args = typing_extensions.get_args(base) if not _utils.is_set(state_type) and args: state_type = args[0] if not _utils.is_set(run_end_type) and len(args) == 3: t = args[2] if not typing_objects.is_never(t): run_end_type = t if _utils.is_set(state_type) and _utils.is_set(run_end_type): return state_type, run_end_type # pyright: ignore[reportReturnType] # break the inner (bases) loop break if not _utils.is_set(state_type): # pragma: no branch # state defaults to None, so use that if we can't infer it state_type = None if not _utils.is_set(run_end_type): # this happens if a graph has no return nodes, use None so any downstream errors are clear run_end_type = None return state_type, run_end_type # pyright: ignore[reportReturnType] def _register_node( self, node: type[BaseNode[StateT, DepsT, RunEndT]], parent_namespace: dict[str, Any] | None, ) -> None: node_id = node.get_node_id() if existing_node := self.node_defs.get(node_id): raise exceptions.GraphSetupError( f'Node ID `{node_id}` is not unique — found on {existing_node.node} and {node}' ) else: self.node_defs[node_id] = node.get_node_def(parent_namespace) def _validate_edges(self): known_node_ids = self.node_defs.keys() bad_edges: dict[str, list[str]] = {} for node_id, node_def in self.node_defs.items(): for edge in node_def.next_node_edges.keys(): if edge not in known_node_ids: bad_edges.setdefault(edge, []).append(f'`{node_id}`') if bad_edges: bad_edges_list = [f'`{k}` is referenced by {_utils.comma_and(v)}' for k, v in bad_edges.items()] if len(bad_edges_list) == 1: raise exceptions.GraphSetupError(f'{bad_edges_list[0]} but not included in the graph.') else: b = '\n'.join(f' {be}' for be in bad_edges_list) raise exceptions.GraphSetupError( f'Nodes are referenced in the graph but not included in the graph:\n{b}' ) def _infer_name(self, function_frame: types.FrameType | None) -> None: """Infer the agent name from the call frame. Usage should be `self._infer_name(inspect.currentframe())`. Copied from `Agent`. """ assert self.name is None, 'Name already set' if function_frame is not None and (parent_frame := function_frame.f_back): # pragma: no branch for name, item in parent_frame.f_locals.items(): if item is self: self.name = name return if parent_frame.f_locals != parent_frame.f_globals: # pragma: no branch # if we couldn't find the agent in locals and globals are a different dict, try globals for name, item in parent_frame.f_globals.items(): # pragma: no branch if item is self: self.name = name return ```` #### __init__ ```python __init__( *, nodes: Sequence[type[BaseNode[StateT, DepsT, RunEndT]]], name: str | None = None, state_type: type[StateT] | Unset = UNSET, run_end_type: type[RunEndT] | Unset = UNSET, auto_instrument: bool = True ) ``` Create a graph from a sequence of nodes. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `nodes` | `Sequence[type[BaseNode[StateT, DepsT, RunEndT]]]` | The nodes which make up the graph, nodes need to be unique and all be generic in the same state type. | *required* | | `name` | `str | None` | Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. 
| `None` | | `state_type` | `type[StateT] | Unset` | The type of the state for the graph, this can generally be inferred from nodes. | `UNSET` | | `run_end_type` | `type[RunEndT] | Unset` | The type of the result of running the graph, this can generally be inferred from nodes. | `UNSET` | | `auto_instrument` | `bool` | Whether to create a span for the graph run and the execution of each node's run method. | `True` | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def __init__( self, *, nodes: Sequence[type[BaseNode[StateT, DepsT, RunEndT]]], name: str | None = None, state_type: type[StateT] | _utils.Unset = _utils.UNSET, run_end_type: type[RunEndT] | _utils.Unset = _utils.UNSET, auto_instrument: bool = True, ): """Create a graph from a sequence of nodes. Args: nodes: The nodes which make up the graph, nodes need to be unique and all be generic in the same state type. name: Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. state_type: The type of the state for the graph, this can generally be inferred from `nodes`. run_end_type: The type of the result of running the graph, this can generally be inferred from `nodes`. auto_instrument: Whether to create a span for the graph run and the execution of each node's run method. """ self.name = name self._state_type = state_type self._run_end_type = run_end_type self.auto_instrument = auto_instrument parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) self.node_defs = {} for node in nodes: self._register_node(node, parent_namespace) self._validate_edges() ``` #### run ```python run( start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: ( BaseStatePersistence[StateT, RunEndT] | None ) = None, infer_name: bool = True ) -> GraphRunResult[StateT, RunEndT] ``` Run the graph from a starting node until it ends. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `start_node` | `BaseNode[StateT, DepsT, RunEndT]` | the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. | *required* | | `state` | `StateT` | The initial state of the graph. | `None` | | `deps` | `DepsT` | The dependencies of the graph. | `None` | | `persistence` | `BaseStatePersistence[StateT, RunEndT] | None` | State persistence interface, defaults to SimpleStatePersistence if None. | `None` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Returns: | Type | Description | | --- | --- | | `GraphRunResult[StateT, RunEndT]` | A GraphRunResult containing information about the run, including its final result. | Here's an example of running the graph from above: run_never_42.py ```py from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=2) state = MyState(41) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=43) ``` Source code in `pydantic_graph/pydantic_graph/graph.py` ````python async def run( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT]: """Run the graph from a starting node until it ends. 
Args: start_node: the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. infer_name: Whether to infer the graph name from the calling frame. Returns: A `GraphRunResult` containing information about the run, including its final result. Here's an example of running the graph from [above][pydantic_graph.graph.Graph]: ```py {title="run_never_42.py" noqa="I001" requires="never_42.py"} from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=2) state = MyState(41) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=43) ``` """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) async with self.iter( start_node, state=state, deps=deps, persistence=persistence, infer_name=False ) as graph_run: async for _node in graph_run: pass result = graph_run.result assert result is not None, 'GraphRun should have a result' return result ```` #### run_sync ```python run_sync( start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: ( BaseStatePersistence[StateT, RunEndT] | None ) = None, infer_name: bool = True ) -> GraphRunResult[StateT, RunEndT] ``` Synchronously run the graph. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `start_node` | `BaseNode[StateT, DepsT, RunEndT]` | the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. | *required* | | `state` | `StateT` | The initial state of the graph. | `None` | | `deps` | `DepsT` | The dependencies of the graph. | `None` | | `persistence` | `BaseStatePersistence[StateT, RunEndT] | None` | State persistence interface, defaults to SimpleStatePersistence if None. | `None` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Returns: | Type | Description | | --- | --- | | `GraphRunResult[StateT, RunEndT]` | The result type from ending the run and the history of the run. | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def run_sync( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT]: """Synchronously run the graph. This is a convenience method that wraps [`self.run`][pydantic_graph.Graph.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Args: start_node: the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. infer_name: Whether to infer the graph name from the calling frame. 
Returns: The result type from ending the run and the history of the run. """ if infer_name and self.name is None: # pragma: no branch self._infer_name(inspect.currentframe()) return _utils.get_event_loop().run_until_complete( self.run(start_node, state=state, deps=deps, persistence=persistence, infer_name=False) ) ``` #### iter ```python iter( start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: ( BaseStatePersistence[StateT, RunEndT] | None ) = None, span: ( AbstractContextManager[AbstractSpan] | None ) = None, infer_name: bool = True ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]] ``` A contextmanager which can be used to iterate over the graph's nodes as they are executed. This method returns a `GraphRun` object which can be used to async-iterate over the nodes of this `Graph` as they are executed. This is the API to use if you want to record or interact with the nodes as the graph execution unfolds. The `GraphRun` can also be used to manually drive the graph execution by calling GraphRun.next. The `GraphRun` provides access to the full run history, state, deps, and the final result of the run once it has completed. For more details, see the API documentation of GraphRun. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `start_node` | `BaseNode[StateT, DepsT, RunEndT]` | the first node to run. Since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. | *required* | | `state` | `StateT` | The initial state of the graph. | `None` | | `deps` | `DepsT` | The dependencies of the graph. | `None` | | `persistence` | `BaseStatePersistence[StateT, RunEndT] | None` | State persistence interface, defaults to SimpleStatePersistence if None. | `None` | | `span` | `AbstractContextManager[AbstractSpan] | None` | The span to use for the graph run. If not provided, a new span will be created. | `None` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Returns: A GraphRun that can be async iterated over to drive the graph to completion. Source code in `pydantic_graph/pydantic_graph/graph.py` ```python @asynccontextmanager async def iter( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]: """A contextmanager which can be used to iterate over the graph's nodes as they are executed. This method returns a `GraphRun` object which can be used to async-iterate over the nodes of this `Graph` as they are executed. This is the API to use if you want to record or interact with the nodes as the graph execution unfolds. The `GraphRun` can also be used to manually drive the graph execution by calling [`GraphRun.next`][pydantic_graph.graph.GraphRun.next]. The `GraphRun` provides access to the full run history, state, deps, and the final result of the run once it has completed. For more details, see the API documentation of [`GraphRun`][pydantic_graph.graph.GraphRun]. Args: start_node: the first node to run. Since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. 
persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. span: The span to use for the graph run. If not provided, a new span will be created. infer_name: Whether to infer the graph name from the calling frame. Returns: A GraphRun that can be async iterated over to drive the graph to completion. """ if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) if persistence is None: persistence = SimpleStatePersistence() persistence.set_graph_types(self) with ExitStack() as stack: entered_span: AbstractSpan | None = None if span is None: if self.auto_instrument: entered_span = stack.enter_context(logfire_span('run graph {graph.name}', graph=self)) else: entered_span = stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) yield GraphRun[StateT, DepsT, RunEndT]( graph=self, start_node=start_node, persistence=persistence, state=state, deps=deps, traceparent=traceparent, ) ``` #### iter_from_persistence ```python iter_from_persistence( persistence: BaseStatePersistence[StateT, RunEndT], *, deps: DepsT = None, span: ( AbstractContextManager[AbstractSpan] | None ) = None, infer_name: bool = True ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]] ``` A contextmanager to iterate over the graph's nodes as they are executed, created from a persistence object. This method has similar functionality to iter, but instead of passing the node to run, it will restore the node and state from state persistence. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `persistence` | `BaseStatePersistence[StateT, RunEndT]` | The state persistence interface to use. | *required* | | `deps` | `DepsT` | The dependencies of the graph. | `None` | | `span` | `AbstractContextManager[AbstractSpan] | None` | The span to use for the graph run. If not provided, a new span will be created. | `None` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Returns: A GraphRun that can be async iterated over to drive the graph to completion. Source code in `pydantic_graph/pydantic_graph/graph.py` ```python @asynccontextmanager async def iter_from_persistence( self, persistence: BaseStatePersistence[StateT, RunEndT], *, deps: DepsT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]: """A contextmanager to iterate over the graph's nodes as they are executed, created from a persistence object. This method has similar functionality to [`iter`][pydantic_graph.graph.Graph.iter], but instead of passing the node to run, it will restore the node and state from state persistence. Args: persistence: The state persistence interface to use. deps: The dependencies of the graph. span: The span to use for the graph run. If not provided, a new span will be created. infer_name: Whether to infer the graph name from the calling frame. Returns: A GraphRun that can be async iterated over to drive the graph to completion. 
""" if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) persistence.set_graph_types(self) snapshot = await persistence.load_next() if snapshot is None: raise exceptions.GraphRuntimeError('Unable to restore snapshot from state persistence.') snapshot.node.set_snapshot_id(snapshot.id) if self.auto_instrument and span is None: # pragma: no branch span = logfire_span('run graph {graph.name}', graph=self) with ExitStack() as stack: entered_span = None if span is None else stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) yield GraphRun[StateT, DepsT, RunEndT]( graph=self, start_node=snapshot.node, persistence=persistence, state=snapshot.state, deps=deps, snapshot_id=snapshot.id, traceparent=traceparent, ) ``` #### initialize ```python initialize( node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], *, state: StateT = None, infer_name: bool = True ) -> None ``` Initialize a new graph run in persistence without running it. This is useful if you want to set up a graph run to be run later, e.g. via iter_from_persistence. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `node` | `BaseNode[StateT, DepsT, RunEndT]` | The node to run first. | *required* | | `persistence` | `BaseStatePersistence[StateT, RunEndT]` | State persistence interface. | *required* | | `state` | `StateT` | The start state of the graph. | `None` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python async def initialize( self, node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], *, state: StateT = None, infer_name: bool = True, ) -> None: """Initialize a new graph run in persistence without running it. This is useful if you want to set up a graph run to be run later, e.g. via [`iter_from_persistence`][pydantic_graph.graph.Graph.iter_from_persistence]. Args: node: The node to run first. persistence: State persistence interface. state: The start state of the graph. infer_name: Whether to infer the graph name from the calling frame. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) persistence.set_graph_types(self) await persistence.snapshot_node(state, node) ``` #### mermaid_code ```python mermaid_code( *, start_node: ( Sequence[NodeIdent] | NodeIdent | None ) = None, title: str | None | Literal[False] = None, edge_labels: bool = True, notes: bool = True, highlighted_nodes: ( Sequence[NodeIdent] | NodeIdent | None ) = None, highlight_css: str = DEFAULT_HIGHLIGHT_CSS, infer_name: bool = True, direction: StateDiagramDirection | None = None ) -> str ``` Generate a diagram representing the graph as [mermaid](https://mermaid.js.org/) diagram. This method calls pydantic_graph.mermaid.generate_code. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `start_node` | `Sequence[NodeIdent] | NodeIdent | None` | The node or nodes which can start the graph. | `None` | | `title` | `str | None | Literal[False]` | The title of the diagram, use False to not include a title. | `None` | | `edge_labels` | `bool` | Whether to include edge labels. | `True` | | `notes` | `bool` | Whether to include notes on each node. 
| `True` | | `highlighted_nodes` | `Sequence[NodeIdent] | NodeIdent | None` | Optional node or nodes to highlight. | `None` | | `highlight_css` | `str` | The CSS to use for highlighting nodes. | `DEFAULT_HIGHLIGHT_CSS` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | | `direction` | `StateDiagramDirection | None` | The direction of flow. | `None` | Returns: | Type | Description | | --- | --- | | `str` | The mermaid code for the graph, which can then be rendered as a diagram. | Here's an example of generating a diagram for the graph from above: mermaid_never_42.py ```py from never_42 import Increment, never_42_graph print(never_42_graph.mermaid_code(start_node=Increment)) ''' --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ''' ``` The rendered diagram will look like this: ``` --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ``` Source code in `pydantic_graph/pydantic_graph/graph.py` ````python def mermaid_code( self, *, start_node: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, title: str | None | typing_extensions.Literal[False] = None, edge_labels: bool = True, notes: bool = True, highlighted_nodes: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, highlight_css: str = mermaid.DEFAULT_HIGHLIGHT_CSS, infer_name: bool = True, direction: mermaid.StateDiagramDirection | None = None, ) -> str: """Generate a diagram representing the graph as [mermaid](https://mermaid.js.org/) diagram. This method calls [`pydantic_graph.mermaid.generate_code`][pydantic_graph.mermaid.generate_code]. Args: start_node: The node or nodes which can start the graph. title: The title of the diagram, use `False` to not include a title. edge_labels: Whether to include edge labels. notes: Whether to include notes on each node. highlighted_nodes: Optional node or nodes to highlight. highlight_css: The CSS to use for highlighting nodes. infer_name: Whether to infer the graph name from the calling frame. direction: The direction of flow. Returns: The mermaid code for the graph, which can then be rendered as a diagram. Here's an example of generating a diagram for the graph from [above][pydantic_graph.graph.Graph]: ```py {title="mermaid_never_42.py" requires="never_42.py"} from never_42 import Increment, never_42_graph print(never_42_graph.mermaid_code(start_node=Increment)) ''' --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ''' ``` The rendered diagram will look like this: ```mermaid --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ``` """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if title is None and self.name: title = self.name return mermaid.generate_code( self, start_node=start_node, highlighted_nodes=highlighted_nodes, highlight_css=highlight_css, title=title or None, edge_labels=edge_labels, notes=notes, direction=direction, ) ```` #### mermaid_image ```python mermaid_image( infer_name: bool = True, **kwargs: Unpack[MermaidConfig] ) -> bytes ``` Generate a diagram representing the graph as an image. The format and diagram can be customized using `kwargs`, see pydantic_graph.mermaid.MermaidConfig. 
Uses external service This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | | `**kwargs` | `Unpack[MermaidConfig]` | Additional arguments to pass to mermaid.request_image. | `{}` | Returns: | Type | Description | | --- | --- | | `bytes` | The image bytes. | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def mermaid_image( self, infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] ) -> bytes: """Generate a diagram representing the graph as an image. The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`][pydantic_graph.mermaid.MermaidConfig]. !!! note "Uses external service" This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Args: infer_name: Whether to infer the graph name from the calling frame. **kwargs: Additional arguments to pass to `mermaid.request_image`. Returns: The image bytes. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if 'title' not in kwargs and self.name: kwargs['title'] = self.name return mermaid.request_image(self, **kwargs) ``` #### mermaid_save ```python mermaid_save( path: Path | str, /, *, infer_name: bool = True, **kwargs: Unpack[MermaidConfig], ) -> None ``` Generate a diagram representing the graph and save it as an image. The format and diagram can be customized using `kwargs`, see pydantic_graph.mermaid.MermaidConfig. Uses external service This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `path` | `Path | str` | The path to save the image to. | *required* | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | | `**kwargs` | `Unpack[MermaidConfig]` | Additional arguments to pass to mermaid.save_image. | `{}` | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def mermaid_save( self, path: Path | str, /, *, infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] ) -> None: """Generate a diagram representing the graph and save it as an image. The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`][pydantic_graph.mermaid.MermaidConfig]. !!! note "Uses external service" This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Args: path: The path to save the image to. infer_name: Whether to infer the graph name from the calling frame. **kwargs: Additional arguments to pass to `mermaid.save_image`. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if 'title' not in kwargs and self.name: kwargs['title'] = self.name mermaid.save_image(path, self, **kwargs) ``` #### get_nodes ```python get_nodes() -> ( Sequence[type[BaseNode[StateT, DepsT, RunEndT]]] ) ``` Get the nodes in the graph. 
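For illustration, here's a minimal sketch (reusing the `never_42` example graph referenced above) showing that `get_nodes` returns the node classes registered with the graph, in the order they were passed to the `Graph` constructor:

get_nodes_never_42.py

```py
from never_42 import never_42_graph

for node_class in never_42_graph.get_nodes():
    print(node_class.get_node_id())
    #> Increment
    #> Check42
```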
Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def get_nodes(self) -> Sequence[type[BaseNode[StateT, DepsT, RunEndT]]]: """Get the nodes in the graph.""" return [node_def.node for node_def in self.node_defs.values()] ``` ### GraphRun Bases: `Generic[StateT, DepsT, RunEndT]` A stateful, async-iterable run of a Graph. You typically get a `GraphRun` instance from calling `async with [my_graph.iter(...)][pydantic_graph.graph.Graph.iter] as graph_run:`. That gives you the ability to iterate through nodes as they run, either by `async for` iteration or by repeatedly calling `.next(...)`. Here's an example of iterating over the graph from above: iter_never_42.py ```py from copy import deepcopy from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=1)), (Increment(), MyState(number=1)), (Check42(), MyState(number=2)), (End(data=2), MyState(number=2)), ] ''' state = MyState(41) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=41)), (Increment(), MyState(number=41)), (Check42(), MyState(number=42)), (Increment(), MyState(number=42)), (Check42(), MyState(number=43)), (End(data=43), MyState(number=43)), ] ''' ``` See the GraphRun.next documentation for an example of how to manually drive the graph run. Source code in `pydantic_graph/pydantic_graph/graph.py` ````python class GraphRun(Generic[StateT, DepsT, RunEndT]): """A stateful, async-iterable run of a [`Graph`][pydantic_graph.graph.Graph]. You typically get a `GraphRun` instance from calling `async with [my_graph.iter(...)][pydantic_graph.graph.Graph.iter] as graph_run:`. That gives you the ability to iterate through nodes as they run, either by `async for` iteration or by repeatedly calling `.next(...)`. Here's an example of iterating over the graph from [above][pydantic_graph.graph.Graph]: ```py {title="iter_never_42.py" noqa="I001" requires="never_42.py"} from copy import deepcopy from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=1)), (Increment(), MyState(number=1)), (Check42(), MyState(number=2)), (End(data=2), MyState(number=2)), ] ''' state = MyState(41) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=41)), (Increment(), MyState(number=41)), (Check42(), MyState(number=42)), (Increment(), MyState(number=42)), (Check42(), MyState(number=43)), (End(data=43), MyState(number=43)), ] ''' ``` See the [`GraphRun.next` documentation][pydantic_graph.graph.GraphRun.next] for an example of how to manually drive the graph run. 
""" def __init__( self, *, graph: Graph[StateT, DepsT, RunEndT], start_node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], state: StateT, deps: DepsT, traceparent: str | None, snapshot_id: str | None = None, ): """Create a new run for a given graph, starting at the specified node. Typically, you'll use [`Graph.iter`][pydantic_graph.graph.Graph.iter] rather than calling this directly. Args: graph: The [`Graph`][pydantic_graph.graph.Graph] to run. start_node: The node where execution will begin. persistence: State persistence interface. state: A shared state object or primitive (like a counter, dataclass, etc.) that is available to all nodes via `ctx.state`. deps: Optional dependencies that each node can access via `ctx.deps`, e.g. database connections, configuration, or logging clients. traceparent: The traceparent for the span used for the graph run. snapshot_id: The ID of the snapshot the node came from. """ self.graph = graph self.persistence = persistence self._snapshot_id: str | None = snapshot_id self.state = state self.deps = deps self.__traceparent = traceparent self._next_node: BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] = start_node self._is_started: bool = False @overload def _traceparent(self, *, required: typing_extensions.Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: if self.__traceparent is None and required: # pragma: no cover raise exceptions.GraphRuntimeError('No span was created for this graph run') return self.__traceparent @property def next_node(self) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """The next node that will be run in the graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. """ return self._next_node @property def result(self) -> GraphRunResult[StateT, RunEndT] | None: """The final result of the graph run if the run is completed, otherwise `None`.""" if not isinstance(self._next_node, End): return None # The GraphRun has not finished running return GraphRunResult[StateT, RunEndT]( self._next_node.data, state=self.state, persistence=self.persistence, traceparent=self._traceparent(required=False), ) async def next( self, node: BaseNode[StateT, DepsT, RunEndT] | None = None ) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """Manually drive the graph run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The graph run should stop when you return an [`End`][pydantic_graph.nodes.End] node. 
Here's an example of using `next` to drive the graph from [above][pydantic_graph.graph.Graph]: ```py {title="next_never_42.py" noqa="I001" requires="never_42.py"} from copy import deepcopy from pydantic_graph import End from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(48) async with never_42_graph.iter(Increment(), state=state) as graph_run: next_node = graph_run.next_node # start with the first node node_states = [(next_node, deepcopy(graph_run.state))] while not isinstance(next_node, End): if graph_run.state.number == 50: graph_run.state.number = 42 next_node = await graph_run.next(next_node) node_states.append((next_node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=48)), (Check42(), MyState(number=49)), (End(data=49), MyState(number=49)), ] ''' ``` Args: node: The node to run next in the graph. If not specified, uses `self.next_node`, which is initialized to the `start_node` of the run and updated each time a new node is returned. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ if node is None: # This cast is necessary because self._next_node could be an `End`. You'll get a runtime error if that's # the case, but if it is, the only way to get there would be to have tried calling next manually after # the run finished. Either way, maybe it would be better to not do this cast... node = cast(BaseNode[StateT, DepsT, RunEndT], self._next_node) node_snapshot_id = node.get_snapshot_id() else: node_snapshot_id = node.get_snapshot_id() if node_snapshot_id != self._snapshot_id: await self.persistence.snapshot_node_if_new(node_snapshot_id, self.state, node) self._snapshot_id = node_snapshot_id if not isinstance(node, BaseNode): # While technically this is not compatible with the documented method signature, it's an easy mistake to # make, and we should eagerly provide a more helpful error message than you'd get otherwise. raise TypeError(f'`next` must be called with a `BaseNode` instance, got {node!r}.') node_id = node.get_node_id() if node_id not in self.graph.node_defs: raise exceptions.GraphRuntimeError(f'Node `{node}` is not in the graph.') with ExitStack() as stack: if self.graph.auto_instrument: stack.enter_context(logfire_span('run node {node_id}', node_id=node_id, node=node)) async with self.persistence.record_run(node_snapshot_id): ctx = GraphRunContext(state=self.state, deps=self.deps) self._next_node = await node.run(ctx) if isinstance(self._next_node, End): self._snapshot_id = self._next_node.get_snapshot_id() await self.persistence.snapshot_end(self.state, self._next_node) elif isinstance(self._next_node, BaseNode): self._snapshot_id = self._next_node.get_snapshot_id() await self.persistence.snapshot_node(self.state, self._next_node) else: raise exceptions.GraphRuntimeError( f'Invalid node return type: `{type(self._next_node).__name__}`. Expected `BaseNode` or `End`.' 
) return self._next_node def __aiter__(self) -> AsyncIterator[BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]]: return self async def __anext__(self) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """Use the last returned node as the input to `Graph.next`.""" if not self._is_started: self._is_started = True return self._next_node if isinstance(self._next_node, End): raise StopAsyncIteration return await self.next(self._next_node) def __repr__(self) -> str: return f'' ```` #### __init__ ```python __init__( *, graph: Graph[StateT, DepsT, RunEndT], start_node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], state: StateT, deps: DepsT, traceparent: str | None, snapshot_id: str | None = None ) ``` Create a new run for a given graph, starting at the specified node. Typically, you'll use Graph.iter rather than calling this directly. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `graph` | `Graph[StateT, DepsT, RunEndT]` | The Graph to run. | *required* | | `start_node` | `BaseNode[StateT, DepsT, RunEndT]` | The node where execution will begin. | *required* | | `persistence` | `BaseStatePersistence[StateT, RunEndT]` | State persistence interface. | *required* | | `state` | `StateT` | A shared state object or primitive (like a counter, dataclass, etc.) that is available to all nodes via ctx.state. | *required* | | `deps` | `DepsT` | Optional dependencies that each node can access via ctx.deps, e.g. database connections, configuration, or logging clients. | *required* | | `traceparent` | `str | None` | The traceparent for the span used for the graph run. | *required* | | `snapshot_id` | `str | None` | The ID of the snapshot the node came from. | `None` | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def __init__( self, *, graph: Graph[StateT, DepsT, RunEndT], start_node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], state: StateT, deps: DepsT, traceparent: str | None, snapshot_id: str | None = None, ): """Create a new run for a given graph, starting at the specified node. Typically, you'll use [`Graph.iter`][pydantic_graph.graph.Graph.iter] rather than calling this directly. Args: graph: The [`Graph`][pydantic_graph.graph.Graph] to run. start_node: The node where execution will begin. persistence: State persistence interface. state: A shared state object or primitive (like a counter, dataclass, etc.) that is available to all nodes via `ctx.state`. deps: Optional dependencies that each node can access via `ctx.deps`, e.g. database connections, configuration, or logging clients. traceparent: The traceparent for the span used for the graph run. snapshot_id: The ID of the snapshot the node came from. """ self.graph = graph self.persistence = persistence self._snapshot_id: str | None = snapshot_id self.state = state self.deps = deps self.__traceparent = traceparent self._next_node: BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] = start_node self._is_started: bool = False ``` #### next_node ```python next_node: BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] ``` The next node that will be run in the graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. #### result ```python result: GraphRunResult[StateT, RunEndT] | None ``` The final result of the graph run if the run is completed, otherwise `None`. 
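To complement the reference above, here's a small self-contained sketch using a hypothetical single-node `CountDown` graph (not part of the library) that shows `result` staying `None` while the run is in progress and being populated once an `End` node is reached:

count_down_result.py

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass

from pydantic_graph import BaseNode, End, Graph, GraphRunContext


@dataclass
class CountDown(BaseNode[None, None, int]):
    """Hypothetical node: keep returning itself until the counter reaches zero."""

    counter: int

    async def run(self, ctx: GraphRunContext) -> CountDown | End[int]:
        if self.counter <= 0:
            return End(self.counter)
        return CountDown(self.counter - 1)


count_down_graph = Graph(nodes=[CountDown])


async def main():
    async with count_down_graph.iter(CountDown(2)) as graph_run:
        print(graph_run.result)  # no `End` node has been returned yet
        #> None
        async for node in graph_run:
            print(node)
            #> CountDown(counter=2)
            #> CountDown(counter=1)
            #> CountDown(counter=0)
            #> End(data=0)
        print(graph_run.result.output)
        #> 0


asyncio.run(main())
```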
#### next ```python next( node: BaseNode[StateT, DepsT, RunEndT] | None = None, ) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] ``` Manually drive the graph run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The graph run should stop when you return an End node. Here's an example of using `next` to drive the graph from above: next_never_42.py ```py from copy import deepcopy from pydantic_graph import End from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(48) async with never_42_graph.iter(Increment(), state=state) as graph_run: next_node = graph_run.next_node # start with the first node node_states = [(next_node, deepcopy(graph_run.state))] while not isinstance(next_node, End): if graph_run.state.number == 50: graph_run.state.number = 42 next_node = await graph_run.next(next_node) node_states.append((next_node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=48)), (Check42(), MyState(number=49)), (End(data=49), MyState(number=49)), ] ''' ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `node` | `BaseNode[StateT, DepsT, RunEndT] | None` | The node to run next in the graph. If not specified, uses self.next_node, which is initialized to the start_node of the run and updated each time a new node is returned. | `None` | Returns: | Type | Description | | --- | --- | | `BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]` | The next node returned by the graph logic, or an End node if | | `BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]` | the run has completed. | Source code in `pydantic_graph/pydantic_graph/graph.py` ````python async def next( self, node: BaseNode[StateT, DepsT, RunEndT] | None = None ) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """Manually drive the graph run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The graph run should stop when you return an [`End`][pydantic_graph.nodes.End] node. Here's an example of using `next` to drive the graph from [above][pydantic_graph.graph.Graph]: ```py {title="next_never_42.py" noqa="I001" requires="never_42.py"} from copy import deepcopy from pydantic_graph import End from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(48) async with never_42_graph.iter(Increment(), state=state) as graph_run: next_node = graph_run.next_node # start with the first node node_states = [(next_node, deepcopy(graph_run.state))] while not isinstance(next_node, End): if graph_run.state.number == 50: graph_run.state.number = 42 next_node = await graph_run.next(next_node) node_states.append((next_node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=48)), (Check42(), MyState(number=49)), (End(data=49), MyState(number=49)), ] ''' ``` Args: node: The node to run next in the graph. If not specified, uses `self.next_node`, which is initialized to the `start_node` of the run and updated each time a new node is returned. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ if node is None: # This cast is necessary because self._next_node could be an `End`. 
You'll get a runtime error if that's # the case, but if it is, the only way to get there would be to have tried calling next manually after # the run finished. Either way, maybe it would be better to not do this cast... node = cast(BaseNode[StateT, DepsT, RunEndT], self._next_node) node_snapshot_id = node.get_snapshot_id() else: node_snapshot_id = node.get_snapshot_id() if node_snapshot_id != self._snapshot_id: await self.persistence.snapshot_node_if_new(node_snapshot_id, self.state, node) self._snapshot_id = node_snapshot_id if not isinstance(node, BaseNode): # While technically this is not compatible with the documented method signature, it's an easy mistake to # make, and we should eagerly provide a more helpful error message than you'd get otherwise. raise TypeError(f'`next` must be called with a `BaseNode` instance, got {node!r}.') node_id = node.get_node_id() if node_id not in self.graph.node_defs: raise exceptions.GraphRuntimeError(f'Node `{node}` is not in the graph.') with ExitStack() as stack: if self.graph.auto_instrument: stack.enter_context(logfire_span('run node {node_id}', node_id=node_id, node=node)) async with self.persistence.record_run(node_snapshot_id): ctx = GraphRunContext(state=self.state, deps=self.deps) self._next_node = await node.run(ctx) if isinstance(self._next_node, End): self._snapshot_id = self._next_node.get_snapshot_id() await self.persistence.snapshot_end(self.state, self._next_node) elif isinstance(self._next_node, BaseNode): self._snapshot_id = self._next_node.get_snapshot_id() await self.persistence.snapshot_node(self.state, self._next_node) else: raise exceptions.GraphRuntimeError( f'Invalid node return type: `{type(self._next_node).__name__}`. Expected `BaseNode` or `End`.' ) return self._next_node ```` #### __anext__ ```python __anext__() -> ( BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] ) ``` Use the last returned node as the input to `Graph.next`. Source code in `pydantic_graph/pydantic_graph/graph.py` ```python async def __anext__(self) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """Use the last returned node as the input to `Graph.next`.""" if not self._is_started: self._is_started = True return self._next_node if isinstance(self._next_node, End): raise StopAsyncIteration return await self.next(self._next_node) ``` ### GraphRunResult Bases: `Generic[StateT, RunEndT]` The final result of running a graph. Source code in `pydantic_graph/pydantic_graph/graph.py` ```python @dataclass(init=False) class GraphRunResult(Generic[StateT, RunEndT]): """The final result of running a graph.""" output: RunEndT state: StateT persistence: BaseStatePersistence[StateT, RunEndT] = field(repr=False) def __init__( self, output: RunEndT, state: StateT, persistence: BaseStatePersistence[StateT, RunEndT], traceparent: str | None = None, ): self.output = output self.state = state self.persistence = persistence self.__traceparent = traceparent @overload def _traceparent(self, *, required: typing_extensions.Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: # pragma: no cover if self.__traceparent is None and required: raise exceptions.GraphRuntimeError('No span was created for this graph run.') return self.__traceparent ``` # `pydantic_graph.mermaid` ### DEFAULT_HIGHLIGHT_CSS ```python DEFAULT_HIGHLIGHT_CSS = 'fill:#fdff32' ``` The default CSS to use for highlighting nodes. 
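To connect `DEFAULT_HIGHLIGHT_CSS` to the `Graph.mermaid_code` method documented above, here's a short sketch (reusing the `never_42` example graph, and assuming its `Check42` node class is importable from that module) that highlights one node with an illustrative CSS string and sets a left-to-right flow:

highlight_never_42.py

```py
from never_42 import Check42, Increment, never_42_graph

print(
    never_42_graph.mermaid_code(
        start_node=Increment,
        highlighted_nodes=Check42,  # a single node, or a sequence of nodes
        highlight_css='fill:#ffddcc',  # illustrative value, replaces DEFAULT_HIGHLIGHT_CSS
        direction='LR',
    )
)
```

The generated code is the same state diagram as before, plus a `direction LR` line and `classDef highlighted fill:#ffddcc` / `class Check42 highlighted` lines that apply the highlight.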
### StateDiagramDirection ```python StateDiagramDirection = Literal['TB', 'LR', 'RL', 'BT'] ``` Used to specify the direction of the state diagram generated by mermaid. - `'TB'`: Top to bottom, this is the default for mermaid charts. - `'LR'`: Left to right - `'RL'`: Right to left - `'BT'`: Bottom to top ### generate_code ```python generate_code( graph: Graph[Any, Any, Any], /, *, start_node: ( Sequence[NodeIdent] | NodeIdent | None ) = None, highlighted_nodes: ( Sequence[NodeIdent] | NodeIdent | None ) = None, highlight_css: str = DEFAULT_HIGHLIGHT_CSS, title: str | None = None, edge_labels: bool = True, notes: bool = True, direction: StateDiagramDirection | None, ) -> str ``` Generate [Mermaid state diagram](https://mermaid.js.org/syntax/stateDiagram.html) code for a graph. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `graph` | `Graph[Any, Any, Any]` | The graph to generate the image for. | *required* | | `start_node` | `Sequence[NodeIdent] | NodeIdent | None` | Identifiers of nodes that start the graph. | `None` | | `highlighted_nodes` | `Sequence[NodeIdent] | NodeIdent | None` | Identifiers of nodes to highlight. | `None` | | `highlight_css` | `str` | CSS to use for highlighting nodes. | `DEFAULT_HIGHLIGHT_CSS` | | `title` | `str | None` | The title of the diagram. | `None` | | `edge_labels` | `bool` | Whether to include edge labels in the diagram. | `True` | | `notes` | `bool` | Whether to include notes in the diagram. | `True` | | `direction` | `StateDiagramDirection | None` | The direction of flow. | *required* | Returns: | Type | Description | | --- | --- | | `str` | The Mermaid code for the graph. | Source code in `pydantic_graph/pydantic_graph/mermaid.py` ```python def generate_code( # noqa: C901 graph: Graph[Any, Any, Any], /, *, start_node: Sequence[NodeIdent] | NodeIdent | None = None, highlighted_nodes: Sequence[NodeIdent] | NodeIdent | None = None, highlight_css: str = DEFAULT_HIGHLIGHT_CSS, title: str | None = None, edge_labels: bool = True, notes: bool = True, direction: StateDiagramDirection | None, ) -> str: """Generate [Mermaid state diagram](https://mermaid.js.org/syntax/stateDiagram.html) code for a graph. Args: graph: The graph to generate the image for. start_node: Identifiers of nodes that start the graph. highlighted_nodes: Identifiers of nodes to highlight. highlight_css: CSS to use for highlighting nodes. title: The title of the diagram. edge_labels: Whether to include edge labels in the diagram. notes: Whether to include notes in the diagram. direction: The direction of flow. Returns: The Mermaid code for the graph. 
""" start_node_ids = set(_node_ids(start_node or ())) for node_id in start_node_ids: if node_id not in graph.node_defs: raise LookupError(f'Start node "{node_id}" is not in the graph.') lines: list[str] = [] if title: lines = ['---', f'title: {title}', '---'] lines.append('stateDiagram-v2') if direction is not None: lines.append(f' direction {direction}') for node_id, node_def in graph.node_defs.items(): # we use round brackets (rounded box) for nodes other than the start and end if node_id in start_node_ids: lines.append(f' [*] --> {node_id}') if node_def.returns_base_node: for next_node_id in graph.node_defs: lines.append(f' {node_id} --> {next_node_id}') else: for next_node_id, edge in node_def.next_node_edges.items(): line = f' {node_id} --> {next_node_id}' if edge_labels and edge.label: line += f': {edge.label}' lines.append(line) if end_edge := node_def.end_edge: line = f' {node_id} --> [*]' if edge_labels and end_edge.label: line += f': {end_edge.label}' lines.append(line) if notes and node_def.note: lines.append(f' note right of {node_id}') # mermaid doesn't like multiple paragraphs in a note, and shows if so clean_docs = re.sub('\n{2,}', '\n', node_def.note) lines.append(indent(clean_docs, ' ')) lines.append(' end note') if highlighted_nodes: lines.append('') lines.append(f'classDef highlighted {highlight_css}') for node_id in _node_ids(highlighted_nodes): if node_id not in graph.node_defs: raise LookupError(f'Highlighted node "{node_id}" is not in the graph.') lines.append(f'class {node_id} highlighted') return '\n'.join(lines) ``` ### request_image ```python request_image( graph: Graph[Any, Any, Any], /, **kwargs: Unpack[MermaidConfig], ) -> bytes ``` Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `graph` | `Graph[Any, Any, Any]` | The graph to generate the image for. | *required* | | `**kwargs` | `Unpack[MermaidConfig]` | Additional parameters to configure mermaid chart generation. | `{}` | Returns: | Type | Description | | --- | --- | | `bytes` | The image data. | Source code in `pydantic_graph/pydantic_graph/mermaid.py` ```python def request_image( graph: Graph[Any, Any, Any], /, **kwargs: Unpack[MermaidConfig], ) -> bytes: """Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink). Args: graph: The graph to generate the image for. **kwargs: Additional parameters to configure mermaid chart generation. Returns: The image data. 
""" code = generate_code( graph, start_node=kwargs.get('start_node'), highlighted_nodes=kwargs.get('highlighted_nodes'), highlight_css=kwargs.get('highlight_css', DEFAULT_HIGHLIGHT_CSS), title=kwargs.get('title'), edge_labels=kwargs.get('edge_labels', True), notes=kwargs.get('notes', True), direction=kwargs.get('direction'), ) code_base64 = base64.b64encode(code.encode()).decode() params: dict[str, str | float] = {} if kwargs.get('image_type') == 'pdf': url = f'https://mermaid.ink/pdf/{code_base64}' if kwargs.get('pdf_fit'): params['fit'] = '' if kwargs.get('pdf_landscape'): params['landscape'] = '' if pdf_paper := kwargs.get('pdf_paper'): params['paper'] = pdf_paper elif kwargs.get('image_type') == 'svg': url = f'https://mermaid.ink/svg/{code_base64}' else: url = f'https://mermaid.ink/img/{code_base64}' if image_type := kwargs.get('image_type'): params['type'] = image_type if background_color := kwargs.get('background_color'): params['bgColor'] = background_color if theme := kwargs.get('theme'): params['theme'] = theme if width := kwargs.get('width'): params['width'] = width if height := kwargs.get('height'): params['height'] = height if scale := kwargs.get('scale'): params['scale'] = scale httpx_client = kwargs.get('httpx_client') or httpx.Client() response = httpx_client.get(url, params=params) if not response.is_success: raise httpx.HTTPStatusError( f'{response.status_code} error generating image:\n{response.text}', request=response.request, response=response, ) return response.content ``` ### save_image ```python save_image( path: Path | str, graph: Graph[Any, Any, Any], /, **kwargs: Unpack[MermaidConfig], ) -> None ``` Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink) and save it to a local file. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `path` | `Path | str` | The path to save the image to. | *required* | | `graph` | `Graph[Any, Any, Any]` | The graph to generate the image for. | *required* | | `**kwargs` | `Unpack[MermaidConfig]` | Additional parameters to configure mermaid chart generation. | `{}` | Source code in `pydantic_graph/pydantic_graph/mermaid.py` ```python def save_image( path: Path | str, graph: Graph[Any, Any, Any], /, **kwargs: Unpack[MermaidConfig], ) -> None: """Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink) and save it to a local file. Args: path: The path to save the image to. graph: The graph to generate the image for. **kwargs: Additional parameters to configure mermaid chart generation. """ if isinstance(path, str): path = Path(path) if 'image_type' not in kwargs: ext = path.suffix.lower()[1:] # no need to check for .jpeg/.jpg, as it is the default if ext in ('png', 'webp', 'svg', 'pdf'): kwargs['image_type'] = ext image_data = request_image(graph, **kwargs) path.write_bytes(image_data) ``` ### MermaidConfig Bases: `TypedDict` Parameters to configure mermaid chart generation. 
Source code in `pydantic_graph/pydantic_graph/mermaid.py` ```python class MermaidConfig(TypedDict, total=False): """Parameters to configure mermaid chart generation.""" start_node: Sequence[NodeIdent] | NodeIdent """Identifiers of nodes that start the graph.""" highlighted_nodes: Sequence[NodeIdent] | NodeIdent """Identifiers of nodes to highlight.""" highlight_css: str """CSS to use for highlighting nodes.""" title: str | None """The title of the diagram.""" edge_labels: bool """Whether to include edge labels in the diagram.""" notes: bool """Whether to include notes on nodes in the diagram, defaults to true.""" image_type: Literal['jpeg', 'png', 'webp', 'svg', 'pdf'] """The image type to generate. If unspecified, the default behavior is `'jpeg'`.""" pdf_fit: bool """When using image_type='pdf', whether to fit the diagram to the PDF page.""" pdf_landscape: bool """When using image_type='pdf', whether to use landscape orientation for the PDF. This has no effect if using `pdf_fit`. """ pdf_paper: Literal['letter', 'legal', 'tabloid', 'ledger', 'a0', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6'] """When using image_type='pdf', the paper size of the PDF.""" background_color: str """The background color of the diagram. If None, the default transparent background is used. The color value is interpreted as a hexadecimal color code by default (and should not have a leading '#'), but you can also use named colors by prefixing the value with `'!'`. For example, valid choices include `background_color='!white'` or `background_color='FF0000'`. """ theme: Literal['default', 'neutral', 'dark', 'forest'] """The theme of the diagram. Defaults to 'default'.""" width: int """The width of the diagram.""" height: int """The height of the diagram.""" scale: Annotated[float, Ge(1), Le(3)] """The scale of the diagram. The scale must be a number between 1 and 3, and you can only set a scale if one or both of width and height are set. """ httpx_client: httpx.Client """An HTTPX client to use for requests, mostly for testing purposes.""" direction: StateDiagramDirection """The direction of the state diagram.""" ``` #### start_node ```python start_node: Sequence[NodeIdent] | NodeIdent ``` Identifiers of nodes that start the graph. #### highlighted_nodes ```python highlighted_nodes: Sequence[NodeIdent] | NodeIdent ``` Identifiers of nodes to highlight. #### highlight_css ```python highlight_css: str ``` CSS to use for highlighting nodes. #### title ```python title: str | None ``` The title of the diagram. #### edge_labels ```python edge_labels: bool ``` Whether to include edge labels in the diagram. #### notes ```python notes: bool ``` Whether to include notes on nodes in the diagram, defaults to true. #### image_type ```python image_type: Literal['jpeg', 'png', 'webp', 'svg', 'pdf'] ``` The image type to generate. If unspecified, the default behavior is `'jpeg'`. #### pdf_fit ```python pdf_fit: bool ``` When using image_type='pdf', whether to fit the diagram to the PDF page. #### pdf_landscape ```python pdf_landscape: bool ``` When using image_type='pdf', whether to use landscape orientation for the PDF. This has no effect if using `pdf_fit`. #### pdf_paper ```python pdf_paper: Literal[ "letter", "legal", "tabloid", "ledger", "a0", "a1", "a2", "a3", "a4", "a5", "a6", ] ``` When using image_type='pdf', the paper size of the PDF. #### background_color ```python background_color: str ``` The background color of the diagram. If None, the default transparent background is used. 
The color value is interpreted as a hexadecimal color code by default (and should not have a leading '#'), but you can also use named colors by prefixing the value with `'!'`. For example, valid choices include `background_color='!white'` or `background_color='FF0000'`. #### theme ```python theme: Literal['default', 'neutral', 'dark', 'forest'] ``` The theme of the diagram. Defaults to 'default'. #### width ```python width: int ``` The width of the diagram. #### height ```python height: int ``` The height of the diagram. #### scale ```python scale: Annotated[float, Ge(1), Le(3)] ``` The scale of the diagram. The scale must be a number between 1 and 3, and you can only set a scale if one or both of width and height are set. #### httpx_client ```python httpx_client: Client ``` An HTTPX client to use for requests, mostly for testing purposes. #### direction ```python direction: StateDiagramDirection ``` The direction of the state diagram. ### NodeIdent ```python NodeIdent: TypeAlias = ( "type[BaseNode[Any, Any, Any]] | BaseNode[Any, Any, Any] | str" ) ``` A type alias for a node identifier. This can be: - A node instance (instance of a subclass of BaseNode). - A node class (subclass of BaseNode). - A string representing the node ID. # `pydantic_graph.nodes` ### StateT ```python StateT = TypeVar('StateT', default=None) ``` Type variable for the state in a graph. ### GraphRunContext Bases: `Generic[StateT, DepsT]` Context for a graph. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @dataclass(kw_only=True) class GraphRunContext(Generic[StateT, DepsT]): """Context for a graph.""" # TODO: Can we get rid of this struct and just pass both these things around..? state: StateT """The state of the graph.""" deps: DepsT """Dependencies for the graph.""" ``` #### state ```python state: StateT ``` The state of the graph. #### deps ```python deps: DepsT ``` Dependencies for the graph. ### BaseNode Bases: `ABC`, `Generic[StateT, DepsT, NodeRunEndT]` Base class for a node. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python class BaseNode(ABC, Generic[StateT, DepsT, NodeRunEndT]): """Base class for a node.""" docstring_notes: ClassVar[bool] = False """Set to `True` to generate mermaid diagram notes from the class's docstring. While this can add valuable information to the diagram, it can make diagrams harder to view, hence it is disabled by default. You can also customise notes overriding the [`get_note`][pydantic_graph.nodes.BaseNode.get_note] method. """ @abstractmethod async def run(self, ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[NodeRunEndT]: """Run the node. This is an abstract method that must be implemented by subclasses. !!! note "Return types used at runtime" The return type of this method are read by `pydantic_graph` at runtime and used to define which nodes can be called next in the graph. This is displayed in [mermaid diagrams](mermaid.md) and enforced when running the graph. Args: ctx: The graph context. Returns: The next node to run or [`End`][pydantic_graph.nodes.End] to signal the end of the graph. """ ... 
def get_snapshot_id(self) -> str: if snapshot_id := getattr(self, '__snapshot_id', None): return snapshot_id else: self.__dict__['__snapshot_id'] = snapshot_id = generate_snapshot_id(self.get_node_id()) return snapshot_id def set_snapshot_id(self, snapshot_id: str) -> None: self.__dict__['__snapshot_id'] = snapshot_id @classmethod @cache def get_node_id(cls) -> str: """Get the ID of the node.""" return cls.__name__ @classmethod def get_note(cls) -> str | None: """Get a note about the node to render on mermaid charts. By default, this returns a note only if [`docstring_notes`][pydantic_graph.nodes.BaseNode.docstring_notes] is `True`. You can override this method to customise the node notes. """ if not cls.docstring_notes: return None docstring = cls.__doc__ # dataclasses get an automatic docstring which is just their signature, we don't want that if docstring and is_dataclass(cls) and docstring.startswith(f'{cls.__name__}('): docstring = None # pragma: no cover if docstring: # pragma: no branch # remove indentation from docstring import inspect docstring = inspect.cleandoc(docstring) return docstring @classmethod def get_node_def(cls, local_ns: dict[str, Any] | None) -> NodeDef[StateT, DepsT, NodeRunEndT]: """Get the node definition.""" type_hints = get_type_hints(cls.run, localns=local_ns, include_extras=True) try: return_hint = type_hints['return'] except KeyError as e: raise exceptions.GraphSetupError(f'Node {cls} is missing a return type hint on its `run` method') from e next_node_edges: dict[str, Edge] = {} end_edge: Edge | None = None returns_base_node: bool = False for return_type in _utils.get_union_args(return_hint): return_type, annotations = _utils.unpack_annotated(return_type) edge = next((a for a in annotations if isinstance(a, Edge)), Edge(None)) return_type_origin = get_origin(return_type) or return_type if return_type_origin is End: end_edge = edge elif return_type_origin is BaseNode: returns_base_node = True elif issubclass(return_type_origin, BaseNode): next_node_edges[return_type.get_node_id()] = edge else: raise exceptions.GraphSetupError(f'Invalid return type: {return_type}') return NodeDef( node=cls, node_id=cls.get_node_id(), note=cls.get_note(), next_node_edges=next_node_edges, end_edge=end_edge, returns_base_node=returns_base_node, ) def deep_copy(self) -> Self: """Returns a deep copy of the node.""" return copy.deepcopy(self) ``` #### docstring_notes ```python docstring_notes: bool = False ``` Set to `True` to generate mermaid diagram notes from the class's docstring. While this can add valuable information to the diagram, it can make diagrams harder to view, hence it is disabled by default. You can also customise notes overriding the get_note method. #### run ```python run( ctx: GraphRunContext[StateT, DepsT], ) -> BaseNode[StateT, DepsT, Any] | End[NodeRunEndT] ``` Run the node. This is an abstract method that must be implemented by subclasses. Return types used at runtime The return type of this method are read by `pydantic_graph` at runtime and used to define which nodes can be called next in the graph. This is displayed in [mermaid diagrams](../mermaid/) and enforced when running the graph. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `ctx` | `GraphRunContext[StateT, DepsT]` | The graph context. | *required* | Returns: | Type | Description | | --- | --- | | `BaseNode[StateT, DepsT, Any] | End[NodeRunEndT]` | The next node to run or End to signal the end of the graph. 
| Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @abstractmethod async def run(self, ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[NodeRunEndT]: """Run the node. This is an abstract method that must be implemented by subclasses. !!! note "Return types used at runtime" The return type of this method are read by `pydantic_graph` at runtime and used to define which nodes can be called next in the graph. This is displayed in [mermaid diagrams](mermaid.md) and enforced when running the graph. Args: ctx: The graph context. Returns: The next node to run or [`End`][pydantic_graph.nodes.End] to signal the end of the graph. """ ... ``` #### get_node_id ```python get_node_id() -> str ``` Get the ID of the node. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @classmethod @cache def get_node_id(cls) -> str: """Get the ID of the node.""" return cls.__name__ ``` #### get_note ```python get_note() -> str | None ``` Get a note about the node to render on mermaid charts. By default, this returns a note only if docstring_notes is `True`. You can override this method to customise the node notes. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @classmethod def get_note(cls) -> str | None: """Get a note about the node to render on mermaid charts. By default, this returns a note only if [`docstring_notes`][pydantic_graph.nodes.BaseNode.docstring_notes] is `True`. You can override this method to customise the node notes. """ if not cls.docstring_notes: return None docstring = cls.__doc__ # dataclasses get an automatic docstring which is just their signature, we don't want that if docstring and is_dataclass(cls) and docstring.startswith(f'{cls.__name__}('): docstring = None # pragma: no cover if docstring: # pragma: no branch # remove indentation from docstring import inspect docstring = inspect.cleandoc(docstring) return docstring ``` #### get_node_def ```python get_node_def( local_ns: dict[str, Any] | None, ) -> NodeDef[StateT, DepsT, NodeRunEndT] ``` Get the node definition. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @classmethod def get_node_def(cls, local_ns: dict[str, Any] | None) -> NodeDef[StateT, DepsT, NodeRunEndT]: """Get the node definition.""" type_hints = get_type_hints(cls.run, localns=local_ns, include_extras=True) try: return_hint = type_hints['return'] except KeyError as e: raise exceptions.GraphSetupError(f'Node {cls} is missing a return type hint on its `run` method') from e next_node_edges: dict[str, Edge] = {} end_edge: Edge | None = None returns_base_node: bool = False for return_type in _utils.get_union_args(return_hint): return_type, annotations = _utils.unpack_annotated(return_type) edge = next((a for a in annotations if isinstance(a, Edge)), Edge(None)) return_type_origin = get_origin(return_type) or return_type if return_type_origin is End: end_edge = edge elif return_type_origin is BaseNode: returns_base_node = True elif issubclass(return_type_origin, BaseNode): next_node_edges[return_type.get_node_id()] = edge else: raise exceptions.GraphSetupError(f'Invalid return type: {return_type}') return NodeDef( node=cls, node_id=cls.get_node_id(), note=cls.get_note(), next_node_edges=next_node_edges, end_edge=end_edge, returns_base_node=returns_base_node, ) ``` #### deep_copy ```python deep_copy() -> Self ``` Returns a deep copy of the node. 
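For illustration, here's a minimal sketch of a hypothetical standalone node (not part of the library) showing `get_node_id`, a docstring-derived note enabled via `docstring_notes`, and `deep_copy`:

node_introspection.py

```python
from __future__ import annotations

from dataclasses import dataclass

from pydantic_graph import BaseNode, End, GraphRunContext


@dataclass
class Audit(BaseNode[None, None, int]):
    """Record the current total before the graph ends."""

    total: int

    docstring_notes = True  # render the docstring as a note on mermaid diagrams

    async def run(self, ctx: GraphRunContext) -> End[int]:
        return End(self.total)


print(Audit.get_node_id())
#> Audit
print(Audit.get_note())
#> Record the current total before the graph ends.

node = Audit(total=3)
copied = node.deep_copy()
copied.total = 4
print(node.total, copied.total)
#> 3 4
```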
Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python def deep_copy(self) -> Self: """Returns a deep copy of the node.""" return copy.deepcopy(self) ``` ### End Bases: `Generic[RunEndT]` Type to return from a node to signal the end of the graph. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @dataclass class End(Generic[RunEndT]): """Type to return from a node to signal the end of the graph.""" data: RunEndT """Data to return from the graph.""" def deep_copy_data(self) -> End[RunEndT]: """Returns a deep copy of the end of the run.""" if self.data is None: return self else: end = End(copy.deepcopy(self.data)) end.set_snapshot_id(self.get_snapshot_id()) return end def get_snapshot_id(self) -> str: if snapshot_id := getattr(self, '__snapshot_id', None): return snapshot_id else: self.__dict__['__snapshot_id'] = snapshot_id = generate_snapshot_id('end') return snapshot_id def set_snapshot_id(self, set_id: str) -> None: self.__dict__['__snapshot_id'] = set_id ``` #### data ```python data: RunEndT ``` Data to return from the graph. #### deep_copy_data ```python deep_copy_data() -> End[RunEndT] ``` Returns a deep copy of the end of the run. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python def deep_copy_data(self) -> End[RunEndT]: """Returns a deep copy of the end of the run.""" if self.data is None: return self else: end = End(copy.deepcopy(self.data)) end.set_snapshot_id(self.get_snapshot_id()) return end ``` ### Edge Annotation to apply a label to an edge in a graph. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @dataclass(frozen=True) class Edge: """Annotation to apply a label to an edge in a graph.""" label: str | None """Label for the edge.""" ``` #### label ```python label: str | None ``` Label for the edge. ### DepsT ```python DepsT = TypeVar('DepsT', default=None, contravariant=True) ``` Type variable for the dependencies of a graph and node. ### RunEndT ```python RunEndT = TypeVar('RunEndT', covariant=True, default=None) ``` Covariant type variable for the return type of a graph run. ### NodeRunEndT ```python NodeRunEndT = TypeVar( "NodeRunEndT", covariant=True, default=Never ) ``` Covariant type variable for the return type of a node run. # `pydantic_graph.persistence` ### SnapshotStatus ```python SnapshotStatus = Literal[ "created", "pending", "running", "success", "error" ] ``` The status of a snapshot. - `'created'`: The snapshot has been created but not yet run. - `'pending'`: The snapshot has been retrieved with load_next but not yet run. - `'running'`: The snapshot is currently running. - `'success'`: The snapshot has been run successfully. - `'error'`: The snapshot has been run but an error occurred. ### NodeSnapshot Bases: `Generic[StateT, RunEndT]` History step describing the execution of a node in a graph. 
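To show what these history steps look like in practice, here's a hedged sketch that runs the `never_42` example graph with an in-memory persistence implementation (assuming `FullStatePersistence` is importable from `pydantic_graph.persistence.in_mem`) and then inspects the recorded snapshots via `load_all`; the exact IDs and number of steps depend on the run:

snapshot_history_never_42.py

```python
import asyncio

from never_42 import Increment, MyState, never_42_graph

from pydantic_graph.persistence.in_mem import FullStatePersistence


async def main():
    persistence = FullStatePersistence()
    await never_42_graph.run(Increment(), state=MyState(1), persistence=persistence)

    for snapshot in await persistence.load_all():
        # `NodeSnapshot`s carry a `status`; the final `EndSnapshot` does not
        print(snapshot.kind, snapshot.id, getattr(snapshot, 'status', None))


asyncio.run(main())
```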
Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @dataclass(kw_only=True) class NodeSnapshot(Generic[StateT, RunEndT]): """History step describing the execution of a node in a graph.""" state: StateT """The state of the graph before the node is run.""" node: Annotated[BaseNode[StateT, Any, RunEndT], _utils.CustomNodeSchema()] """The node to run next.""" start_ts: datetime | None = None """The timestamp when the node started running, `None` until the run starts.""" duration: float | None = None """The duration of the node run in seconds, if the node has been run.""" status: SnapshotStatus = 'created' """The status of the snapshot.""" kind: Literal['node'] = 'node' """The kind of history step, can be used as a discriminator when deserializing history.""" id: str = UNSET_SNAPSHOT_ID """Unique ID of the snapshot.""" def __post_init__(self) -> None: if self.id == UNSET_SNAPSHOT_ID: self.id = self.node.get_snapshot_id() ``` #### state ```python state: StateT ``` The state of the graph before the node is run. #### node ```python node: Annotated[ BaseNode[StateT, Any, RunEndT], CustomNodeSchema() ] ``` The node to run next. #### start_ts ```python start_ts: datetime | None = None ``` The timestamp when the node started running, `None` until the run starts. #### duration ```python duration: float | None = None ``` The duration of the node run in seconds, if the node has been run. #### status ```python status: SnapshotStatus = 'created' ``` The status of the snapshot. #### kind ```python kind: Literal['node'] = 'node' ``` The kind of history step, can be used as a discriminator when deserializing history. #### id ```python id: str = UNSET_SNAPSHOT_ID ``` Unique ID of the snapshot. ### EndSnapshot Bases: `Generic[StateT, RunEndT]` History step describing the end of a graph run. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @dataclass(kw_only=True) class EndSnapshot(Generic[StateT, RunEndT]): """History step describing the end of a graph run.""" state: StateT """The state of the graph at the end of the run.""" result: End[RunEndT] """The result of the graph run.""" ts: datetime = field(default_factory=_utils.now_utc) """The timestamp when the graph run ended.""" kind: Literal['end'] = 'end' """The kind of history step, can be used as a discriminator when deserializing history.""" id: str = UNSET_SNAPSHOT_ID """Unique ID of the snapshot.""" def __post_init__(self) -> None: if self.id == UNSET_SNAPSHOT_ID: self.id = self.node.get_snapshot_id() @property def node(self) -> End[RunEndT]: """Shim to get the [`result`][pydantic_graph.persistence.EndSnapshot.result]. Useful to allow `[snapshot.node for snapshot in persistence.history]`. """ return self.result ``` #### state ```python state: StateT ``` The state of the graph at the end of the run. #### result ```python result: End[RunEndT] ``` The result of the graph run. #### ts ```python ts: datetime = field(default_factory=now_utc) ``` The timestamp when the graph run ended. #### kind ```python kind: Literal['end'] = 'end' ``` The kind of history step, can be used as a discriminator when deserializing history. #### id ```python id: str = UNSET_SNAPSHOT_ID ``` Unique ID of the snapshot. #### node ```python node: End[RunEndT] ``` Shim to get the result. Useful to allow `[snapshot.node for snapshot in persistence.history]`. ### Snapshot ```python Snapshot = ( NodeSnapshot[StateT, RunEndT] | EndSnapshot[StateT, RunEndT] ) ``` A step in the history of a graph run. 
Graph.run returns a list of these steps describing the execution of the graph, together with the run return value. ### BaseStatePersistence Bases: `ABC`, `Generic[StateT, RunEndT]` Abstract base class for storing the state of a graph run. Each instance of a `BaseStatePersistence` subclass should be used for a single graph run. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python class BaseStatePersistence(ABC, Generic[StateT, RunEndT]): """Abstract base class for storing the state of a graph run. Each instance of a `BaseStatePersistence` subclass should be used for a single graph run. """ @abstractmethod async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: """Snapshot the state of a graph, when the next step is to run a node. This method should add a [`NodeSnapshot`][pydantic_graph.persistence.NodeSnapshot] to persistence. Args: state: The state of the graph. next_node: The next node to run. """ raise NotImplementedError @abstractmethod async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: """Snapshot the state of a graph if the snapshot ID doesn't already exist in persistence. This method will generally call [`snapshot_node`][pydantic_graph.persistence.BaseStatePersistence.snapshot_node] but should do so in an atomic way. Args: snapshot_id: The ID of the snapshot to check. state: The state of the graph. next_node: The next node to run. """ raise NotImplementedError @abstractmethod async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: """Snapshot the state of a graph when the graph has ended. This method should add an [`EndSnapshot`][pydantic_graph.persistence.EndSnapshot] to persistence. Args: state: The state of the graph. end: data from the end of the run. """ raise NotImplementedError @abstractmethod def record_run(self, snapshot_id: str) -> AbstractAsyncContextManager[None]: """Record the run of the node, or error if the node is already running. Args: snapshot_id: The ID of the snapshot to record. Raises: GraphNodeRunningError: if the node status it not `'created'` or `'pending'`. LookupError: if the snapshot ID is not found in persistence. Returns: An async context manager that records the run of the node. In particular this should set: - [`NodeSnapshot.status`][pydantic_graph.persistence.NodeSnapshot.status] to `'running'` and [`NodeSnapshot.start_ts`][pydantic_graph.persistence.NodeSnapshot.start_ts] when the run starts. - [`NodeSnapshot.status`][pydantic_graph.persistence.NodeSnapshot.status] to `'success'` or `'error'` and [`NodeSnapshot.duration`][pydantic_graph.persistence.NodeSnapshot.duration] when the run finishes. """ raise NotImplementedError @abstractmethod async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: """Retrieve a node snapshot with status `'created`' and set its status to `'pending'`. This is used by [`Graph.iter_from_persistence`][pydantic_graph.graph.Graph.iter_from_persistence] to get the next node to run. Returns: The snapshot, or `None` if no snapshot with status `'created`' exists. """ raise NotImplementedError @abstractmethod async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: """Load the entire history of snapshots. `load_all` is not used by pydantic-graph itself, instead it's provided to make it convenient to get all [snapshots][pydantic_graph.persistence.Snapshot] from persistence. Returns: The list of snapshots. 
""" raise NotImplementedError def set_graph_types(self, graph: Graph[StateT, Any, RunEndT]) -> None: """Set the types of the state and run end from a graph. You generally won't need to customise this method, instead implement [`set_types`][pydantic_graph.persistence.BaseStatePersistence.set_types] and [`should_set_types`][pydantic_graph.persistence.BaseStatePersistence.should_set_types]. """ if self.should_set_types(): with _utils.set_nodes_type_context(graph.get_nodes()): self.set_types(*graph.inferred_types) def should_set_types(self) -> bool: """Whether types need to be set. Implementations should override this method to return `True` when types have not been set if they are needed. """ return False def set_types(self, state_type: type[StateT], run_end_type: type[RunEndT]) -> None: """Set the types of the state and run end. This can be used to create [type adapters][pydantic.TypeAdapter] for serializing and deserializing snapshots, e.g. with [`build_snapshot_list_type_adapter`][pydantic_graph.persistence.build_snapshot_list_type_adapter]. Args: state_type: The state type. run_end_type: The run end type. """ pass ``` #### snapshot_node ```python snapshot_node( state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None ``` Snapshot the state of a graph, when the next step is to run a node. This method should add a NodeSnapshot to persistence. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `state` | `StateT` | The state of the graph. | *required* | | `next_node` | `BaseNode[StateT, Any, RunEndT]` | The next node to run. | *required* | Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: """Snapshot the state of a graph, when the next step is to run a node. This method should add a [`NodeSnapshot`][pydantic_graph.persistence.NodeSnapshot] to persistence. Args: state: The state of the graph. next_node: The next node to run. """ raise NotImplementedError ``` #### snapshot_node_if_new ```python snapshot_node_if_new( snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT], ) -> None ``` Snapshot the state of a graph if the snapshot ID doesn't already exist in persistence. This method will generally call snapshot_node but should do so in an atomic way. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `snapshot_id` | `str` | The ID of the snapshot to check. | *required* | | `state` | `StateT` | The state of the graph. | *required* | | `next_node` | `BaseNode[StateT, Any, RunEndT]` | The next node to run. | *required* | Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: """Snapshot the state of a graph if the snapshot ID doesn't already exist in persistence. This method will generally call [`snapshot_node`][pydantic_graph.persistence.BaseStatePersistence.snapshot_node] but should do so in an atomic way. Args: snapshot_id: The ID of the snapshot to check. state: The state of the graph. next_node: The next node to run. """ raise NotImplementedError ``` #### snapshot_end ```python snapshot_end(state: StateT, end: End[RunEndT]) -> None ``` Snapshot the state of a graph when the graph has ended. This method should add an EndSnapshot to persistence. 
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `state` | `StateT` | The state of the graph. | *required* | | `end` | `End[RunEndT]` | data from the end of the run. | *required* | Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: """Snapshot the state of a graph when the graph has ended. This method should add an [`EndSnapshot`][pydantic_graph.persistence.EndSnapshot] to persistence. Args: state: The state of the graph. end: data from the end of the run. """ raise NotImplementedError ``` #### record_run ```python record_run( snapshot_id: str, ) -> AbstractAsyncContextManager[None] ``` Record the run of the node, or error if the node is already running. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `snapshot_id` | `str` | The ID of the snapshot to record. | *required* | Raises: | Type | Description | | --- | --- | | `GraphNodeRunningError` | if the node status it not 'created' or 'pending'. | | `LookupError` | if the snapshot ID is not found in persistence. | Returns: | Type | Description | | --- | --- | | `AbstractAsyncContextManager[None]` | An async context manager that records the run of the node. | In particular this should set: - NodeSnapshot.status to `'running'` and NodeSnapshot.start_ts when the run starts. - NodeSnapshot.status to `'success'` or `'error'` and NodeSnapshot.duration when the run finishes. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod def record_run(self, snapshot_id: str) -> AbstractAsyncContextManager[None]: """Record the run of the node, or error if the node is already running. Args: snapshot_id: The ID of the snapshot to record. Raises: GraphNodeRunningError: if the node status it not `'created'` or `'pending'`. LookupError: if the snapshot ID is not found in persistence. Returns: An async context manager that records the run of the node. In particular this should set: - [`NodeSnapshot.status`][pydantic_graph.persistence.NodeSnapshot.status] to `'running'` and [`NodeSnapshot.start_ts`][pydantic_graph.persistence.NodeSnapshot.start_ts] when the run starts. - [`NodeSnapshot.status`][pydantic_graph.persistence.NodeSnapshot.status] to `'success'` or `'error'` and [`NodeSnapshot.duration`][pydantic_graph.persistence.NodeSnapshot.duration] when the run finishes. """ raise NotImplementedError ``` #### load_next ```python load_next() -> NodeSnapshot[StateT, RunEndT] | None ``` Retrieve a node snapshot with status `'created`' and set its status to `'pending'`. This is used by Graph.iter_from_persistence to get the next node to run. Returns: The snapshot, or `None` if no snapshot with status `'created`' exists. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: """Retrieve a node snapshot with status `'created`' and set its status to `'pending'`. This is used by [`Graph.iter_from_persistence`][pydantic_graph.graph.Graph.iter_from_persistence] to get the next node to run. Returns: The snapshot, or `None` if no snapshot with status `'created`' exists. """ raise NotImplementedError ``` #### load_all ```python load_all() -> list[Snapshot[StateT, RunEndT]] ``` Load the entire history of snapshots. `load_all` is not used by pydantic-graph itself, instead it's provided to make it convenient to get all snapshots from persistence. 
Returns: The list of snapshots. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: """Load the entire history of snapshots. `load_all` is not used by pydantic-graph itself, instead it's provided to make it convenient to get all [snapshots][pydantic_graph.persistence.Snapshot] from persistence. Returns: The list of snapshots. """ raise NotImplementedError ``` #### set_graph_types ```python set_graph_types(graph: Graph[StateT, Any, RunEndT]) -> None ``` Set the types of the state and run end from a graph. You generally won't need to customise this method, instead implement set_types and should_set_types. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python def set_graph_types(self, graph: Graph[StateT, Any, RunEndT]) -> None: """Set the types of the state and run end from a graph. You generally won't need to customise this method, instead implement [`set_types`][pydantic_graph.persistence.BaseStatePersistence.set_types] and [`should_set_types`][pydantic_graph.persistence.BaseStatePersistence.should_set_types]. """ if self.should_set_types(): with _utils.set_nodes_type_context(graph.get_nodes()): self.set_types(*graph.inferred_types) ``` #### should_set_types ```python should_set_types() -> bool ``` Whether types need to be set. Implementations should override this method to return `True` when types have not been set if they are needed. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python def should_set_types(self) -> bool: """Whether types need to be set. Implementations should override this method to return `True` when types have not been set if they are needed. """ return False ``` #### set_types ```python set_types( state_type: type[StateT], run_end_type: type[RunEndT] ) -> None ``` Set the types of the state and run end. This can be used to create type adapters for serializing and deserializing snapshots, e.g. with build_snapshot_list_type_adapter. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `state_type` | `type[StateT]` | The state type. | *required* | | `run_end_type` | `type[RunEndT]` | The run end type. | *required* | Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python def set_types(self, state_type: type[StateT], run_end_type: type[RunEndT]) -> None: """Set the types of the state and run end. This can be used to create [type adapters][pydantic.TypeAdapter] for serializing and deserializing snapshots, e.g. with [`build_snapshot_list_type_adapter`][pydantic_graph.persistence.build_snapshot_list_type_adapter]. Args: state_type: The state type. run_end_type: The run end type. """ pass ``` ### build_snapshot_list_type_adapter ```python build_snapshot_list_type_adapter( state_t: type[StateT], run_end_t: type[RunEndT] ) -> TypeAdapter[list[Snapshot[StateT, RunEndT]]] ``` Build a type adapter for a list of snapshots. This method should be called from within set_types where context variables will be set such that Pydantic can create a schema for NodeSnapshot.node. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python def build_snapshot_list_type_adapter( state_t: type[StateT], run_end_t: type[RunEndT] ) -> pydantic.TypeAdapter[list[Snapshot[StateT, RunEndT]]]: """Build a type adapter for a list of snapshots. 
This method should be called from within [`set_types`][pydantic_graph.persistence.BaseStatePersistence.set_types] where context variables will be set such that Pydantic can create a schema for [`NodeSnapshot.node`][pydantic_graph.persistence.NodeSnapshot.node]. """ return pydantic.TypeAdapter(list[Annotated[Snapshot[state_t, run_end_t], pydantic.Discriminator('kind')]]) ``` In memory state persistence. This module provides simple in memory state persistence for graphs. ### SimpleStatePersistence Bases: `BaseStatePersistence[StateT, RunEndT]` Simple in memory state persistence that just hold the latest snapshot. If no state persistence implementation is provided when running a graph, this is used by default. Source code in `pydantic_graph/pydantic_graph/persistence/in_mem.py` ```python @dataclass class SimpleStatePersistence(BaseStatePersistence[StateT, RunEndT]): """Simple in memory state persistence that just hold the latest snapshot. If no state persistence implementation is provided when running a graph, this is used by default. """ last_snapshot: Snapshot[StateT, RunEndT] | None = None """The last snapshot.""" async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: self.last_snapshot = NodeSnapshot(state=state, node=next_node) async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: if self.last_snapshot and self.last_snapshot.id == snapshot_id: return # pragma: no cover else: await self.snapshot_node(state, next_node) async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: self.last_snapshot = EndSnapshot(state=state, result=end) @asynccontextmanager async def record_run(self, snapshot_id: str) -> AsyncIterator[None]: if self.last_snapshot is None or snapshot_id != self.last_snapshot.id: raise LookupError(f'No snapshot found with id={snapshot_id!r}') assert isinstance(self.last_snapshot, NodeSnapshot), 'Only NodeSnapshot can be recorded' exceptions.GraphNodeStatusError.check(self.last_snapshot.status) self.last_snapshot.status = 'running' self.last_snapshot.start_ts = _utils.now_utc() start = perf_counter() try: yield except Exception: self.last_snapshot.duration = perf_counter() - start self.last_snapshot.status = 'error' raise else: self.last_snapshot.duration = perf_counter() - start self.last_snapshot.status = 'success' async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: if isinstance(self.last_snapshot, NodeSnapshot) and self.last_snapshot.status == 'created': self.last_snapshot.status = 'pending' return self.last_snapshot async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: raise NotImplementedError('load is not supported for SimpleStatePersistence') ``` #### last_snapshot ```python last_snapshot: Snapshot[StateT, RunEndT] | None = None ``` The last snapshot. ### FullStatePersistence Bases: `BaseStatePersistence[StateT, RunEndT]` In memory state persistence that hold a list of snapshots. Source code in `pydantic_graph/pydantic_graph/persistence/in_mem.py` ```python @dataclass class FullStatePersistence(BaseStatePersistence[StateT, RunEndT]): """In memory state persistence that hold a list of snapshots.""" deep_copy: bool = True """Whether to deep copy the state and nodes when storing them. Defaults to `True` so even if nodes or state are modified after the snapshot is taken, the persistence history will record the value at the time of the snapshot. 
""" history: list[Snapshot[StateT, RunEndT]] = field(default_factory=list) """List of snapshots taken during the graph run.""" _snapshots_type_adapter: pydantic.TypeAdapter[list[Snapshot[StateT, RunEndT]]] | None = field( default=None, init=False, repr=False ) async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: snapshot = NodeSnapshot( state=self._prep_state(state), node=next_node.deep_copy() if self.deep_copy else next_node, ) self.history.append(snapshot) async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: if not any(s.id == snapshot_id for s in self.history): await self.snapshot_node(state, next_node) async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: snapshot = EndSnapshot( state=self._prep_state(state), result=end.deep_copy_data() if self.deep_copy else end, ) self.history.append(snapshot) @asynccontextmanager async def record_run(self, snapshot_id: str) -> AsyncIterator[None]: try: snapshot = next(s for s in self.history if s.id == snapshot_id) except StopIteration as e: raise LookupError(f'No snapshot found with id={snapshot_id!r}') from e assert isinstance(snapshot, NodeSnapshot), 'Only NodeSnapshot can be recorded' exceptions.GraphNodeStatusError.check(snapshot.status) snapshot.status = 'running' snapshot.start_ts = _utils.now_utc() start = perf_counter() try: yield except Exception: snapshot.duration = perf_counter() - start snapshot.status = 'error' raise else: snapshot.duration = perf_counter() - start snapshot.status = 'success' async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: if snapshot := next((s for s in self.history if isinstance(s, NodeSnapshot) and s.status == 'created'), None): snapshot.status = 'pending' return snapshot async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: return self.history def should_set_types(self) -> bool: return self._snapshots_type_adapter is None def set_types(self, state_type: type[StateT], run_end_type: type[RunEndT]) -> None: self._snapshots_type_adapter = build_snapshot_list_type_adapter(state_type, run_end_type) def dump_json(self, *, indent: int | None = None) -> bytes: """Dump the history to JSON bytes.""" assert self._snapshots_type_adapter is not None, 'type adapter must be set to use `dump_json`' return self._snapshots_type_adapter.dump_json(self.history, indent=indent) def load_json(self, json_data: str | bytes | bytearray) -> None: """Load the history from JSON.""" assert self._snapshots_type_adapter is not None, 'type adapter must be set to use `load_json`' self.history = self._snapshots_type_adapter.validate_json(json_data) def _prep_state(self, state: StateT) -> StateT: """Prepare state for snapshot, uses [`copy.deepcopy`][copy.deepcopy] by default.""" if not self.deep_copy or state is None: return state else: return copy.deepcopy(state) ``` #### deep_copy ```python deep_copy: bool = True ``` Whether to deep copy the state and nodes when storing them. Defaults to `True` so even if nodes or state are modified after the snapshot is taken, the persistence history will record the value at the time of the snapshot. #### history ```python history: list[Snapshot[StateT, RunEndT]] = field( default_factory=list ) ``` List of snapshots taken during the graph run. #### dump_json ```python dump_json(*, indent: int | None = None) -> bytes ``` Dump the history to JSON bytes. 
Source code in `pydantic_graph/pydantic_graph/persistence/in_mem.py` ```python def dump_json(self, *, indent: int | None = None) -> bytes: """Dump the history to JSON bytes.""" assert self._snapshots_type_adapter is not None, 'type adapter must be set to use `dump_json`' return self._snapshots_type_adapter.dump_json(self.history, indent=indent) ``` #### load_json ```python load_json(json_data: str | bytes | bytearray) -> None ``` Load the history from JSON. Source code in `pydantic_graph/pydantic_graph/persistence/in_mem.py` ```python def load_json(self, json_data: str | bytes | bytearray) -> None: """Load the history from JSON.""" assert self._snapshots_type_adapter is not None, 'type adapter must be set to use `load_json`' self.history = self._snapshots_type_adapter.validate_json(json_data) ``` ### FileStatePersistence Bases: `BaseStatePersistence[StateT, RunEndT]` File based state persistence that hold graph run state in a JSON file. Source code in `pydantic_graph/pydantic_graph/persistence/file.py` ````python @dataclass class FileStatePersistence(BaseStatePersistence[StateT, RunEndT]): """File based state persistence that hold graph run state in a JSON file.""" json_file: Path """Path to the JSON file where the snapshots are stored. You should use a different file for each graph run, but a single file should be reused for multiple steps of the same run. For example if you have a run ID of the form `run_123abc`, you might create a `FileStatePersistence` thus: ```py from pathlib import Path from pydantic_graph import FullStatePersistence run_id = 'run_123abc' persistence = FullStatePersistence(Path('runs') / f'{run_id}.json') ``` """ _snapshots_type_adapter: pydantic.TypeAdapter[list[Snapshot[StateT, RunEndT]]] | None = field( default=None, init=False, repr=False ) async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: await self._append_save(NodeSnapshot(state=state, node=next_node)) async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: async with self._lock(): snapshots = await self.load_all() if not any(s.id == snapshot_id for s in snapshots): # pragma: no branch await self._append_save(NodeSnapshot(state=state, node=next_node), lock=False) async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: await self._append_save(EndSnapshot(state=state, result=end)) @asynccontextmanager async def record_run(self, snapshot_id: str) -> AsyncIterator[None]: async with self._lock(): snapshots = await self.load_all() try: snapshot = next(s for s in snapshots if s.id == snapshot_id) except StopIteration as e: raise LookupError(f'No snapshot found with id={snapshot_id!r}') from e assert isinstance(snapshot, NodeSnapshot), 'Only NodeSnapshot can be recorded' exceptions.GraphNodeStatusError.check(snapshot.status) snapshot.status = 'running' snapshot.start_ts = _utils.now_utc() await self._save(snapshots) start = perf_counter() try: yield except Exception: duration = perf_counter() - start async with self._lock(): await _graph_utils.run_in_executor(self._after_run_sync, snapshot_id, duration, 'error') raise else: snapshot.duration = perf_counter() - start async with self._lock(): await _graph_utils.run_in_executor(self._after_run_sync, snapshot_id, snapshot.duration, 'success') async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: async with self._lock(): snapshots = await self.load_all() if snapshot := next((s for s in snapshots if isinstance(s, NodeSnapshot) and 
s.status == 'created'), None): snapshot.status = 'pending' await self._save(snapshots) return snapshot def should_set_types(self) -> bool: """Whether types need to be set.""" return self._snapshots_type_adapter is None def set_types(self, state_type: type[StateT], run_end_type: type[RunEndT]) -> None: self._snapshots_type_adapter = build_snapshot_list_type_adapter(state_type, run_end_type) async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: return await _graph_utils.run_in_executor(self._load_sync) def _load_sync(self) -> list[Snapshot[StateT, RunEndT]]: assert self._snapshots_type_adapter is not None, 'snapshots type adapter must be set' try: content = self.json_file.read_bytes() except FileNotFoundError: return [] else: return self._snapshots_type_adapter.validate_json(content) def _after_run_sync(self, snapshot_id: str, duration: float, status: SnapshotStatus) -> None: snapshots = self._load_sync() snapshot = next(s for s in snapshots if s.id == snapshot_id) assert isinstance(snapshot, NodeSnapshot), 'Only NodeSnapshot can be recorded' snapshot.duration = duration snapshot.status = status self._save_sync(snapshots) async def _save(self, snapshots: list[Snapshot[StateT, RunEndT]]) -> None: await _graph_utils.run_in_executor(self._save_sync, snapshots) def _save_sync(self, snapshots: list[Snapshot[StateT, RunEndT]]) -> None: assert self._snapshots_type_adapter is not None, 'snapshots type adapter must be set' self.json_file.write_bytes(self._snapshots_type_adapter.dump_json(snapshots, indent=2)) async def _append_save(self, snapshot: Snapshot[StateT, RunEndT], *, lock: bool = True) -> None: assert self._snapshots_type_adapter is not None, 'snapshots type adapter must be set' async with AsyncExitStack() as stack: if lock: await stack.enter_async_context(self._lock()) snapshots = await self.load_all() snapshots.append(snapshot) await self._save(snapshots) @asynccontextmanager async def _lock(self, *, timeout: float = 1.0) -> AsyncIterator[None]: """Lock a file by checking and writing a `.pydantic-graph-persistence-lock` to it. Args: timeout: how long to wait for the lock Returns: an async context manager that holds the lock """ lock_file = self.json_file.parent / f'{self.json_file.name}.pydantic-graph-persistence-lock' lock_id = secrets.token_urlsafe().encode() with anyio.fail_after(timeout): while not await _file_append_check(lock_file, lock_id): await anyio.sleep(0.01) try: yield finally: await _graph_utils.run_in_executor(lock_file.unlink, missing_ok=True) ```` #### json_file ```python json_file: Path ``` Path to the JSON file where the snapshots are stored. You should use a different file for each graph run, but a single file should be reused for multiple steps of the same run. For example if you have a run ID of the form `run_123abc`, you might create a `FileStatePersistence` thus: ```py from pathlib import Path from pydantic_graph import FullStatePersistence run_id = 'run_123abc' persistence = FullStatePersistence(Path('runs') / f'{run_id}.json') ``` #### should_set_types ```python should_set_types() -> bool ``` Whether types need to be set. Source code in `pydantic_graph/pydantic_graph/persistence/file.py` ```python def should_set_types(self) -> bool: """Whether types need to be set.""" return self._snapshots_type_adapter is None ``` # Evals # Evals "Evals" refers to evaluating a model's performance for a specific application. 
Warning Unlike unit tests, evals are an emerging art/science; anyone who claims to know for sure exactly how your evals should be defined can safely be ignored. Pydantic Evals is a powerful evaluation framework designed to help you systematically test and evaluate the performance and accuracy of the systems you build, especially when working with LLMs. We've designed Pydantic Evals to be useful while not being too opinionated since we (along with everyone else) are still figuring out best practices. We'd love your [feedback](../help/) on the package and how we can improve it. In Beta Pydantic Evals support was [introduced](https://github.com/pydantic/pydantic-ai/pull/935) in v0.0.47 and is currently in beta. The API is subject to change and the documentation is incomplete. ## Installation To install the Pydantic Evals package, run: ```bash pip install pydantic-evals ``` ```bash uv add pydantic-evals ``` `pydantic-evals` does not depend on `pydantic-ai`, but has an optional dependency on `logfire` if you'd like to use OpenTelemetry traces in your evals, or send evaluation results to [logfire](https://pydantic.dev/logfire). ```bash pip install 'pydantic-evals[logfire]' ``` ```bash uv add 'pydantic-evals[logfire]' ``` ## Datasets and Cases In Pydantic Evals, everything begins with `Dataset`s and `Case`s: - Case: A single test scenario corresponding to "task" inputs. Can also optionally have a name, expected outputs, metadata, and evaluators. - Dataset: A collection of test cases designed for the evaluation of a specific task or function. simple_eval_dataset.py ```python from pydantic_evals import Case, Dataset case1 = Case( name='simple_case', inputs='What is the capital of France?', expected_output='Paris', metadata={'difficulty': 'easy'}, ) dataset = Dataset(cases=[case1]) ``` *(This example is complete, it can be run "as is")* ## Evaluators Evaluators are the components that analyze and score the results of your task when tested against a case. Pydantic Evals includes several built-in evaluators and allows you to create custom evaluators: simple_eval_evaluator.py ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.evaluators.common import IsInstance from simple_eval_dataset import dataset dataset.add_evaluator(IsInstance(type_name='str')) # (1)! @dataclass class MyEvaluator(Evaluator): async def evaluate(self, ctx: EvaluatorContext[str, str]) -> float: # (2)! if ctx.output == ctx.expected_output: return 1.0 elif ( isinstance(ctx.output, str) and ctx.expected_output.lower() in ctx.output.lower() ): return 0.8 else: return 0.0 dataset.add_evaluator(MyEvaluator()) ``` 1. You can add built-in evaluators to a dataset using the add_evaluator method. 1. This custom evaluator returns a simple score based on whether the output matches the expected output. *(This example is complete, it can be run "as is")* ## Evaluation Process The evaluation process involves running a task against all cases in a dataset: Putting the above two examples together and using the more declarative `evaluators` kwarg to Dataset: simple_eval_complete.py ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext, IsInstance case1 = Case( # (1)! 
name='simple_case', inputs='What is the capital of France?', expected_output='Paris', metadata={'difficulty': 'easy'}, ) class MyEvaluator(Evaluator[str, str]): def evaluate(self, ctx: EvaluatorContext[str, str]) -> float: if ctx.output == ctx.expected_output: return 1.0 elif ( isinstance(ctx.output, str) and ctx.expected_output.lower() in ctx.output.lower() ): return 0.8 else: return 0.0 dataset = Dataset( cases=[case1], evaluators=[IsInstance(type_name='str'), MyEvaluator()], # (3)! ) async def guess_city(question: str) -> str: # (4)! return 'Paris' report = dataset.evaluate_sync(guess_city) # (5)! report.print(include_input=True, include_output=True, include_durations=False) # (6)! """ Evaluation Summary: guess_city ┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓ ┃ Case ID ┃ Inputs ┃ Outputs ┃ Scores ┃ Assertions ┃ ┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩ │ simple_case │ What is the capital of France? │ Paris │ MyEvaluator: 1.00 │ ✔ │ ├─────────────┼────────────────────────────────┼─────────┼───────────────────┼────────────┤ │ Averages │ │ │ MyEvaluator: 1.00 │ 100.0% ✔ │ └─────────────┴────────────────────────────────┴─────────┴───────────────────┴────────────┘ """ ``` 1. Create a test case as above 1. Also create a custom evaluator function as above 1. Create a Dataset with test cases, also set the evaluators when creating the dataset 1. Our function to evaluate. 1. Run the evaluation with evaluate_sync, which runs the function against all test cases in the dataset, and returns an EvaluationReport object. 1. Print the report with print, which shows the results of the evaluation, including input and output. We have omitted duration here just to keep the printed output from changing from run to run. *(This example is complete, it can be run "as is")* ## Evaluation with `LLMJudge` In this example we evaluate a method for generating recipes based on customer orders. judge_recipes.py ```python from __future__ import annotations from typing import Any from pydantic import BaseModel from pydantic_ai import Agent, format_as_xml from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import IsInstance, LLMJudge class CustomerOrder(BaseModel): # (1)! dish_name: str dietary_restriction: str | None = None class Recipe(BaseModel): ingredients: list[str] steps: list[str] recipe_agent = Agent( 'groq:llama-3.3-70b-versatile', output_type=Recipe, system_prompt=( 'Generate a recipe to cook the dish that meets the dietary restrictions.' ), ) async def transform_recipe(customer_order: CustomerOrder) -> Recipe: # (2)! r = await recipe_agent.run(format_as_xml(customer_order)) return r.output recipe_dataset = Dataset[CustomerOrder, Recipe, Any]( # (3)! cases=[ Case( name='vegetarian_recipe', inputs=CustomerOrder( dish_name='Spaghetti Bolognese', dietary_restriction='vegetarian' ), expected_output=None, # (4) metadata={'focus': 'vegetarian'}, evaluators=( LLMJudge( # (5)! rubric='Recipe should not contain meat or animal products', ), ), ), Case( name='gluten_free_recipe', inputs=CustomerOrder( dish_name='Chocolate Cake', dietary_restriction='gluten-free' ), expected_output=None, metadata={'focus': 'gluten-free'}, # Case-specific evaluator with a focused rubric evaluators=( LLMJudge( rubric='Recipe should not contain gluten or wheat products', ), ), ), ], evaluators=[ # (6)! 
IsInstance(type_name='Recipe'), LLMJudge( rubric='Recipe should have clear steps and relevant ingredients', include_input=True, model='anthropic:claude-3-7-sonnet-latest', # (7)! ), ], ) report = recipe_dataset.evaluate_sync(transform_recipe) print(report) """ Evaluation Summary: transform_recipe ┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ vegetarian_recipe │ ✔✔✔ │ 10ms │ ├────────────────────┼────────────┼──────────┤ │ gluten_free_recipe │ ✔✔✔ │ 10ms │ ├────────────────────┼────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 10ms │ └────────────────────┴────────────┴──────────┘ """ ``` 1. Define models for our task — Input for recipe generation task and output of the task. 1. Define our recipe generation function - this is the task we want to evaluate. 1. Create a dataset with different test cases and different rubrics. 1. No expected output, we'll let the LLM judge the quality. 1. Case-specific evaluator with a focused rubric using LLMJudge. 1. Dataset-level evaluators that apply to all cases, including a general quality rubric for all recipes 1. By default `LLMJudge` uses `openai:gpt-4o`, here we use a specific Anthropic model. *(This example is complete, it can be run "as is")* ## Saving and Loading Datasets Datasets can be saved to and loaded from YAML or JSON files. save_load_dataset_example.py ```python from pathlib import Path from judge_recipes import CustomerOrder, Recipe, recipe_dataset from pydantic_evals import Dataset recipe_transforms_file = Path('recipe_transform_tests.yaml') recipe_dataset.to_file(recipe_transforms_file) # (1)! print(recipe_transforms_file.read_text()) """ # yaml-language-server: $schema=recipe_transform_tests_schema.json cases: - name: vegetarian_recipe inputs: dish_name: Spaghetti Bolognese dietary_restriction: vegetarian metadata: focus: vegetarian evaluators: - LLMJudge: Recipe should not contain meat or animal products - name: gluten_free_recipe inputs: dish_name: Chocolate Cake dietary_restriction: gluten-free metadata: focus: gluten-free evaluators: - LLMJudge: Recipe should not contain gluten or wheat products evaluators: - IsInstance: Recipe - LLMJudge: rubric: Recipe should have clear steps and relevant ingredients model: anthropic:claude-3-7-sonnet-latest include_input: true """ # Load dataset from file loaded_dataset = Dataset[CustomerOrder, Recipe, dict].from_file(recipe_transforms_file) print(f'Loaded dataset with {len(loaded_dataset.cases)} cases') #> Loaded dataset with 2 cases ``` *(This example is complete, it can be run "as is")* ## Parallel Evaluation You can control concurrency during evaluation (this might be useful to prevent exceeding a rate limit): parallel_evaluation_example.py ```python import asyncio import time from pydantic_evals import Case, Dataset # Create a dataset with multiple test cases dataset = Dataset( cases=[ Case( name=f'case_{i}', inputs=i, expected_output=i * 2, ) for i in range(5) ] ) async def double_number(input_value: int) -> int: """Function that simulates work by sleeping for a tenth of a second before returning double the input.""" await asyncio.sleep(0.1) # Simulate work return input_value * 2 # Run evaluation with unlimited concurrency t0 = time.time() report_default = dataset.evaluate_sync(double_number) print(f'Evaluation took less than 0.5s: {time.time() - t0 < 0.5}') #> Evaluation took less than 0.5s: True report_default.print(include_input=True, include_output=True, include_durations=False) # (1)! 
""" Evaluation Summary: double_number ┏━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┓ ┃ Case ID ┃ Inputs ┃ Outputs ┃ ┡━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━┩ │ case_0 │ 0 │ 0 │ ├──────────┼────────┼─────────┤ │ case_1 │ 1 │ 2 │ ├──────────┼────────┼─────────┤ │ case_2 │ 2 │ 4 │ ├──────────┼────────┼─────────┤ │ case_3 │ 3 │ 6 │ ├──────────┼────────┼─────────┤ │ case_4 │ 4 │ 8 │ ├──────────┼────────┼─────────┤ │ Averages │ │ │ └──────────┴────────┴─────────┘ """ # Run evaluation with limited concurrency t0 = time.time() report_limited = dataset.evaluate_sync(double_number, max_concurrency=1) print(f'Evaluation took more than 0.5s: {time.time() - t0 > 0.5}') #> Evaluation took more than 0.5s: True report_limited.print(include_input=True, include_output=True, include_durations=False) # (2)! """ Evaluation Summary: double_number ┏━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┓ ┃ Case ID ┃ Inputs ┃ Outputs ┃ ┡━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━┩ │ case_0 │ 0 │ 0 │ ├──────────┼────────┼─────────┤ │ case_1 │ 1 │ 2 │ ├──────────┼────────┼─────────┤ │ case_2 │ 2 │ 4 │ ├──────────┼────────┼─────────┤ │ case_3 │ 3 │ 6 │ ├──────────┼────────┼─────────┤ │ case_4 │ 4 │ 8 │ ├──────────┼────────┼─────────┤ │ Averages │ │ │ └──────────┴────────┴─────────┘ """ ``` 1. We have omitted duration here just to keep the printed output from changing from run to run. 1. We have omitted duration here just to keep the printed output from changing from run to run. *(This example is complete, it can be run "as is")* ## OpenTelemetry Integration Pydantic Evals integrates with OpenTelemetry for tracing. The EvaluatorContext includes a property called `span_tree` which returns a SpanTree. The `SpanTree` provides a way to query and analyze the spans generated during function execution. This provides a way to access the results of instrumentation during evaluation. Note If you just want to write unit tests that ensure that specific spans are produced during calls to your evaluation task, it's usually better to just use the `logfire.testing.capfire` fixture directly. There are two main ways this is useful. 
opentelemetry_example.py ```python import asyncio from typing import Any import logfire from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator from pydantic_evals.evaluators.context import EvaluatorContext from pydantic_evals.otel.span_tree import SpanQuery logfire.configure( # ensure that an OpenTelemetry tracer is configured send_to_logfire='if-token-present' ) class SpanTracingEvaluator(Evaluator[str, str]): """Evaluator that analyzes the span tree generated during function execution.""" def evaluate(self, ctx: EvaluatorContext[str, str]) -> dict[str, Any]: # Get the span tree from the context span_tree = ctx.span_tree if span_tree is None: return {'has_spans': False, 'performance_score': 0.0} # Find all spans with "processing" in the name processing_spans = span_tree.find(lambda node: 'processing' in node.name) # Calculate total processing time total_processing_time = sum( (span.duration.total_seconds() for span in processing_spans), 0.0 ) # Check for error spans error_query: SpanQuery = {'name_contains': 'error'} has_errors = span_tree.any(error_query) # Calculate a performance score (lower is better) performance_score = 1.0 if total_processing_time < 1.0 else 0.5 return { 'has_spans': True, 'has_errors': has_errors, 'performance_score': 0 if has_errors else performance_score, } async def process_text(text: str) -> str: """Function that processes text with OpenTelemetry instrumentation.""" with logfire.span('process_text'): # Simulate initial processing with logfire.span('text_processing'): await asyncio.sleep(0.1) processed = text.strip().lower() # Simulate additional processing with logfire.span('additional_processing'): if 'error' in processed: with logfire.span('error_handling'): logfire.error(f'Error detected in text: {text}') return f'Error processing: {text}' await asyncio.sleep(0.2) processed = processed.replace(' ', '_') return f'Processed: {processed}' # Create test cases dataset = Dataset( cases=[ Case( name='normal_text', inputs='Hello World', expected_output='Processed: hello_world', ), Case( name='text_with_error', inputs='Contains error marker', expected_output='Error processing: Contains error marker', ), ], evaluators=[SpanTracingEvaluator()], ) # Run evaluation - spans are automatically captured since logfire is configured report = dataset.evaluate_sync(process_text) # Print the report report.print(include_input=True, include_output=True, include_durations=False) # (1)! """ Evaluation Summary: process_text ┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓ ┃ Case ID ┃ Inputs ┃ Outputs ┃ Scores ┃ Assertions ┃ ┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩ │ normal_text │ Hello World │ Processed: hello_world │ performance_score: 1.00 │ ✔✗ │ ├─────────────────┼───────────────────────┼─────────────────────────────────────────┼──────────────────────────┼────────────┤ │ text_with_error │ Contains error marker │ Error processing: Contains error marker │ performance_score: 0 │ ✔✔ │ ├─────────────────┼───────────────────────┼─────────────────────────────────────────┼──────────────────────────┼────────────┤ │ Averages │ │ │ performance_score: 0.500 │ 75.0% ✔ │ └─────────────────┴───────────────────────┴─────────────────────────────────────────┴──────────────────────────┴────────────┘ """ ``` 1. We have omitted duration here just to keep the printed output from changing from run to run. 
*(This example is complete, it can be run "as is")*

## Generating Test Datasets

Pydantic Evals allows you to generate test datasets using LLMs with generate_dataset.

Datasets can be generated in either JSON or YAML format, in both cases a JSON schema file is generated alongside the dataset and referenced in the dataset, so you should get type checking and auto-completion in your editor.

generate_dataset_example.py

```python
from __future__ import annotations

from pathlib import Path

from pydantic import BaseModel, Field

from pydantic_evals import Dataset
from pydantic_evals.generation import generate_dataset


class QuestionInputs(BaseModel, use_attribute_docstrings=True):  # (1)!
    """Model for question inputs."""

    question: str
    """A question to answer"""
    context: str | None = None
    """Optional context for the question"""


class AnswerOutput(BaseModel, use_attribute_docstrings=True):  # (2)!
    """Model for expected answer outputs."""

    answer: str
    """The answer to the question"""
    confidence: float = Field(ge=0, le=1)
    """Confidence level (0-1)"""


class MetadataType(BaseModel, use_attribute_docstrings=True):  # (3)!
    """Metadata model for test cases."""

    difficulty: str
    """Difficulty level (easy, medium, hard)"""
    category: str
    """Question category"""


async def main():
    dataset = await generate_dataset(  # (4)!
        dataset_type=Dataset[QuestionInputs, AnswerOutput, MetadataType],
        n_examples=2,
        extra_instructions="""
        Generate question-answer pairs about world capitals and landmarks.
        Make sure to include both easy and challenging questions.
        """,
    )
    output_file = Path('questions_cases.yaml')
    dataset.to_file(output_file)  # (5)!
    print(output_file.read_text())
    """
    # yaml-language-server: $schema=questions_cases_schema.json
    cases:
    - name: Easy Capital Question
      inputs:
        question: What is the capital of France?
      metadata:
        difficulty: easy
        category: Geography
      expected_output:
        answer: Paris
        confidence: 0.95
      evaluators:
      - EqualsExpected
    - name: Challenging Landmark Question
      inputs:
        question: Which world-famous landmark is located on the banks of the Seine River?
      metadata:
        difficulty: hard
        category: Landmarks
      expected_output:
        answer: Eiffel Tower
        confidence: 0.9
      evaluators:
      - EqualsExpected
    """
```

1. Define the schema for the inputs to the task.
1. Define the schema for the expected outputs of the task.
1. Define the schema for the metadata of the test cases.
1. Call generate_dataset to create a Dataset with 2 cases conforming to the schema.
1. Save the dataset to a YAML file; this will also write `questions_cases_schema.json` with the JSON schema for `questions_cases.yaml` to make editing easier. The magic `yaml-language-server` comment is supported by at least vscode, jetbrains/pycharm (more details [here](https://github.com/redhat-developer/yaml-language-server#using-inlined-schema)).

*(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)*

You can also write datasets as JSON files:

generate_dataset_example_json.py

```python
from pathlib import Path

from pydantic_evals import Dataset
from pydantic_evals.generation import generate_dataset

from generate_dataset_example import AnswerOutput, MetadataType, QuestionInputs


async def main():
    dataset = await generate_dataset(  # (1)!
        dataset_type=Dataset[QuestionInputs, AnswerOutput, MetadataType],
        n_examples=2,
        extra_instructions="""
        Generate question-answer pairs about world capitals and landmarks.
        Make sure to include both easy and challenging questions.
        """,
    )
    output_file = Path('questions_cases.json')
    dataset.to_file(output_file)  # (2)!
    print(output_file.read_text())
    """
    {
      "$schema": "questions_cases_schema.json",
      "cases": [
        {
          "name": "Easy Capital Question",
          "inputs": {
            "question": "What is the capital of France?"
          },
          "metadata": {
            "difficulty": "easy",
            "category": "Geography"
          },
          "expected_output": {
            "answer": "Paris",
            "confidence": 0.95
          },
          "evaluators": [
            "EqualsExpected"
          ]
        },
        {
          "name": "Challenging Landmark Question",
          "inputs": {
            "question": "Which world-famous landmark is located on the banks of the Seine River?"
          },
          "metadata": {
            "difficulty": "hard",
            "category": "Landmarks"
          },
          "expected_output": {
            "answer": "Eiffel Tower",
            "confidence": 0.9
          },
          "evaluators": [
            "EqualsExpected"
          ]
        }
      ]
    }
    """
```

1. Generate the Dataset exactly as above.
1. Save the dataset to a JSON file; this will also write `questions_cases_schema.json` with the JSON schema for `questions_cases.json`. This time the `$schema` key is included in the JSON file to define the schema for IDEs to use while you edit the file, there's no formal spec for this, but it works in vscode and pycharm and is discussed at length in [json-schema-org/json-schema-spec#828](https://github.com/json-schema-org/json-schema-spec/issues/828).

*(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)*

## Integration with Logfire

Pydantic Evals is implemented using OpenTelemetry to record traces of the evaluation process. These traces contain all the information included in the terminal output as attributes, but also include full tracing from the executions of the evaluation task function.

You can send these traces to any OpenTelemetry-compatible backend, including [Pydantic Logfire](https://logfire.pydantic.dev/docs).

All you need to do is configure Logfire via `logfire.configure`:

logfire_integration.py

```python
import logfire

from judge_recipes import recipe_dataset, transform_recipe

logfire.configure(
    send_to_logfire='if-token-present',  # (1)!
    environment='development',  # (2)!
    service_name='evals',  # (3)!
)

recipe_dataset.evaluate_sync(transform_recipe)
```

1. The `send_to_logfire` argument controls when traces are sent to Logfire. You can set it to `'if-token-present'` to send data to Logfire only if the `LOGFIRE_TOKEN` environment variable is set. See the [Logfire configuration docs](https://logfire.pydantic.dev/docs/reference/configuration/) for more details.
1. The `environment` argument sets the environment for the traces. It's a good idea to set this to `'development'` when running tests or evaluations and sending data to a project with production data, to make it easier to filter these traces out while reviewing data from your production environment(s).
1. The `service_name` argument sets the service name for the traces. This is displayed in the Logfire UI to help you identify the source of the associated spans.
Logfire has some special integration with Pydantic Evals traces, including a table view of the evaluation results on the evaluation root span (which is generated in each call to Dataset.evaluate), and a detailed view of the inputs and outputs for the execution of each case.

In addition, any OpenTelemetry spans generated during the evaluation process will be sent to Logfire, allowing you to visualize the full execution of the code called during the evaluation process.

This can be especially helpful when attempting to write evaluators that make use of the `span_tree` property of the EvaluatorContext, as described in the [OpenTelemetry Integration](#opentelemetry-integration) section above.

This allows you to write evaluations that depend on information about which code paths were executed during the call to the task function without needing to manually instrument the code being evaluated, as long as the code being evaluated is already adequately instrumented with OpenTelemetry. In the case of Pydantic AI agents, for example, this can be used to ensure specific tools are (or are not) called during the execution of specific cases.

Using OpenTelemetry in this way also means that all data used to evaluate the task executions will be accessible in the traces produced by production runs of the code, making it straightforward to perform the same evaluations on production data.

# MCP

# Client

Pydantic AI can act as an [MCP client](https://modelcontextprotocol.io/quickstart/client), connecting to MCP servers to use their tools.

## Install

You need to either install [`pydantic-ai`](../../install/), or [`pydantic-ai-slim`](../../install/#slim-install) with the `mcp` optional group:

```bash
pip install "pydantic-ai-slim[mcp]"
```

```bash
uv add "pydantic-ai-slim[mcp]"
```

## Usage

Pydantic AI comes with three ways to connect to MCP servers:

- MCPServerStreamableHTTP which connects to an MCP server using the [Streamable HTTP](https://modelcontextprotocol.io/introduction#streamable-http) transport
- MCPServerSSE which connects to an MCP server using the [HTTP SSE](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) transport
- MCPServerStdio which runs the server as a subprocess and connects to it using the [stdio](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio) transport

Examples of all three are shown below.

Each MCP server instance is a [toolset](../../toolsets/) and can be registered with an Agent using the `toolsets` argument.

You can use the async with agent context manager to open and close connections to all registered servers (and in the case of stdio servers, start and stop the subprocesses) around the context where they'll be used in agent runs. You can also use async with server to manage the connection or subprocess of a specific server, for example if you'd like to use it with multiple agents.

If you don't explicitly enter one of these context managers to set up the server, this will be done automatically when it's needed (e.g. to list the available tools or call a specific tool), but it's more efficient to do so around the entire context where you expect the servers to be used.

### Streamable HTTP Client

MCPServerStreamableHTTP connects over HTTP using the [Streamable HTTP](https://modelcontextprotocol.io/introduction#streamable-http) transport to a server.

Note

MCPServerStreamableHTTP requires an MCP server to be running and accepting HTTP connections before running the agent.
Running the server is not managed by Pydantic AI.

Before creating the Streamable HTTP client, we need to run a server that supports the Streamable HTTP transport.

streamable_http_server.py

```python
from mcp.server.fastmcp import FastMCP

app = FastMCP()


@app.tool()
def add(a: int, b: int) -> int:
    return a + b


if __name__ == '__main__':
    app.run(transport='streamable-http')
```

Then we can create the client:

mcp_streamable_http_client.py

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStreamableHTTP

server = MCPServerStreamableHTTP('http://localhost:8000/mcp')  # (1)!
agent = Agent('openai:gpt-4o', toolsets=[server])  # (2)!


async def main():
    async with agent:  # (3)!
        result = await agent.run('What is 7 plus 5?')
    print(result.output)
    #> The answer is 12.
```

1. Define the MCP server with the URL used to connect.
1. Create an agent with the MCP server attached.
1. Create a client session to connect to the server.

*(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)*

**What's happening here?**

- The model receives the prompt "What is 7 plus 5?"
- The model decides "Oh, I've got this `add` tool, that will be a good way to answer this question", and returns a tool call
- Pydantic AI sends the tool call to the MCP server over the Streamable HTTP transport
- The model is called again with the tool's return value
- The model returns the final answer

You can visualise this clearly by adding three lines of code to instrument the example with [logfire](https://logfire.pydantic.dev/docs):

mcp_sse_client_logfire.py

```python
import logfire

logfire.configure()
logfire.instrument_pydantic_ai()
```

The agent run, including the tool call and the MCP server's response, will then be displayed in Logfire.

### SSE Client

MCPServerSSE connects over HTTP using the [HTTP + Server Sent Events transport](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) to a server.

Note

The SSE transport in MCP is deprecated, you should use Streamable HTTP instead.

Before creating the SSE client, we need to run a server that supports the SSE transport.

sse_server.py

```python
from mcp.server.fastmcp import FastMCP

app = FastMCP()


@app.tool()
def add(a: int, b: int) -> int:
    return a + b


if __name__ == '__main__':
    app.run(transport='sse')
```

Then we can create the client:

mcp_sse_client.py

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerSSE

server = MCPServerSSE('http://localhost:3001/sse')  # (1)!
agent = Agent('openai:gpt-4o', toolsets=[server])  # (2)!


async def main():
    async with agent:  # (3)!
        result = await agent.run('What is 7 plus 5?')
    print(result.output)
    #> The answer is 12.
```

1. Define the MCP server with the URL used to connect.
1. Create an agent with the MCP server attached.
1. Create a client session to connect to the server.

*(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)*

### MCP "stdio" Server

MCP also offers [stdio transport](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio) where the server is run as a subprocess and communicates with the client over `stdin` and `stdout`. In this case, you'd use the MCPServerStdio class.

In this example [mcp-run-python](https://github.com/pydantic/mcp-run-python) is used as the MCP server.
mcp_stdio_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio( # (1)! 'uv', args=['run', 'mcp-run-python', 'stdio'], timeout=10 ) agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: result = await agent.run('How many days between 2000-01-01 and 2025-03-18?') print(result.output) #> There are 9,208 days between January 1, 2000, and March 18, 2025. ``` 1. See [MCP Run Python](https://github.com/pydantic/mcp-run-python) for more information. ## Loading MCP Servers from Configuration Instead of creating MCP server instances individually in code, you can load multiple servers from a JSON configuration file using load_mcp_servers(). This is particularly useful when you need to manage multiple MCP servers or want to configure servers externally without modifying code. ### Configuration Format The configuration file should be a JSON file with an `mcpServers` object containing server definitions. Each server is identified by a unique key and contains the configuration for that server type: mcp_config.json ```json { "mcpServers": { "python-runner": { "command": "uv", "args": ["run", "mcp-run-python", "stdio"] }, "weather-api": { "url": "http://localhost:3001/sse" }, "calculator": { "url": "http://localhost:8000/mcp" } } } ``` Note The MCP server is only inferred to be an SSE server because of the `/sse` suffix. Any other server with the "url" field will be inferred to be a Streamable HTTP server. We made this decision given that the SSE transport is deprecated. ### Usage mcp_config_loader.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import load_mcp_servers # Load all servers from configuration file servers = load_mcp_servers('mcp_config.json') # Create agent with all loaded servers agent = Agent('openai:gpt-5', toolsets=servers) async def main(): async with agent: result = await agent.run('What is 7 plus 5?') print(result.output) ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Tool call customisation The MCP servers provide the ability to set a `process_tool_call` which allows the customisation of tool call requests and their responses. A common use case for this is to inject metadata to the requests which the server call needs: mcp_process_tool_call.py ```python from typing import Any from pydantic_ai import Agent, RunContext from pydantic_ai.mcp import CallToolFunc, MCPServerStdio, ToolResult from pydantic_ai.models.test import TestModel async def process_tool_call( ctx: RunContext[int], call_tool: CallToolFunc, name: str, tool_args: dict[str, Any], ) -> ToolResult: """A tool call processor that passes along the deps.""" return await call_tool(name, tool_args, {'deps': ctx.deps}) server = MCPServerStdio('python', args=['mcp_server.py'], process_tool_call=process_tool_call) agent = Agent( model=TestModel(call_tools=['echo_deps']), deps_type=int, toolsets=[server] ) async def main(): async with agent: result = await agent.run('Echo with deps set to 42', deps=42) print(result.output) #> {"echo_deps":{"echo":"This is an echo message","deps":42}} ``` How to access the metadata is MCP server SDK specific. 
For example with the [MCP Python SDK](https://github.com/modelcontextprotocol/python-sdk), it is accessible via the [`ctx: Context`](https://github.com/modelcontextprotocol/python-sdk#context) argument that can be included on tool call handlers: mcp_server.py ```python from typing import Any from mcp.server.fastmcp import Context, FastMCP from mcp.server.session import ServerSession mcp = FastMCP('Pydantic AI MCP Server') log_level = 'unset' @mcp.tool() async def echo_deps(ctx: Context[ServerSession, None]) -> dict[str, Any]: """Echo the run context. Args: ctx: Context object containing request and session information. Returns: Dictionary with an echo message and the deps. """ await ctx.info('This is an info message') deps: Any = getattr(ctx.request_context.meta, 'deps') return {'echo': 'This is an echo message', 'deps': deps} if __name__ == '__main__': mcp.run() ``` ## Using Tool Prefixes to Avoid Naming Conflicts When connecting to multiple MCP servers that might provide tools with the same name, you can use the `tool_prefix` parameter to avoid naming conflicts. This parameter adds a prefix to all tool names from a specific server. This allows you to use multiple servers that might have overlapping tool names without conflicts: mcp_tool_prefix_http_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE # Create two servers with different prefixes weather_server = MCPServerSSE( 'http://localhost:3001/sse', tool_prefix='weather' # Tools will be prefixed with 'weather_' ) calculator_server = MCPServerSSE( 'http://localhost:3002/sse', tool_prefix='calc' # Tools will be prefixed with 'calc_' ) # Both servers might have a tool named 'get_data', but they'll be exposed as: # - 'weather_get_data' # - 'calc_get_data' agent = Agent('openai:gpt-4o', toolsets=[weather_server, calculator_server]) ``` ## Custom TLS / SSL configuration In some environments you need to tweak how HTTPS connections are established – for example to trust an internal Certificate Authority, present a client certificate for **mTLS**, or (during local development only!) disable certificate verification altogether. All HTTP-based MCP client classes (MCPServerStreamableHTTP and MCPServerSSE) expose an `http_client` parameter that lets you pass your own pre-configured [`httpx.AsyncClient`](https://www.python-httpx.org/async/). mcp_custom_tls_client.py ```python import ssl import httpx from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE # Trust an internal / self-signed CA ssl_ctx = ssl.create_default_context(cafile='/etc/ssl/private/my_company_ca.pem') # OPTIONAL: if the server requires **mutual TLS** load your client certificate ssl_ctx.load_cert_chain(certfile='/etc/ssl/certs/client.crt', keyfile='/etc/ssl/private/client.key',) http_client = httpx.AsyncClient( verify=ssl_ctx, timeout=httpx.Timeout(10.0), ) server = MCPServerSSE( 'http://localhost:3001/sse', http_client=http_client, # (1)! ) agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: result = await agent.run('How many days between 2000-01-01 and 2025-03-18?') print(result.output) #> There are 9,208 days between January 1, 2000, and March 18, 2025. ``` 1. When you supply `http_client`, Pydantic AI re-uses this client for every request. Anything supported by **httpx** (`verify`, `cert`, custom proxies, timeouts, etc.) therefore applies to all MCP traffic. ## MCP Sampling What is MCP Sampling? 
In MCP [sampling](https://modelcontextprotocol.io/docs/concepts/sampling) is a system by which an MCP server can make LLM calls via the MCP client - effectively proxying requests to an LLM via the client over whatever transport is being used. Sampling is extremely useful when MCP servers need to use Gen AI but you don't want to provision them each with their own LLM credentials or when a public MCP server would like the connecting client to pay for LLM calls. Confusingly it has nothing to do with the concept of "sampling" in observability, or frankly the concept of "sampling" in any other domain. Sampling Diagram Here's a mermaid diagram that may or may not make the data flow clearer: ``` sequenceDiagram participant LLM participant MCP_Client as MCP client participant MCP_Server as MCP server MCP_Client->>LLM: LLM call LLM->>MCP_Client: LLM tool call response MCP_Client->>MCP_Server: tool call MCP_Server->>MCP_Client: sampling "create message" MCP_Client->>LLM: LLM call LLM->>MCP_Client: LLM text response MCP_Client->>MCP_Server: sampling response MCP_Server->>MCP_Client: tool call response ``` Pydantic AI supports sampling as both a client and server. See the [server](../server/#mcp-sampling) documentation for details on how to use sampling within a server. Sampling is automatically supported by Pydantic AI agents when they act as a client. To be able to use sampling, an MCP server instance needs to have a sampling_model set. This can be done either directly on the server using the constructor keyword argument or the property, or by using agent.set_mcp_sampling_model() to set the agent's model or one specified as an argument as the sampling model on all MCP servers registered with that agent. Let's say we have an MCP server that wants to use sampling (in this case to generate an SVG as per the tool arguments). Sampling MCP Server generate_svg.py ````python import re from pathlib import Path from mcp import SamplingMessage from mcp.server.fastmcp import Context, FastMCP from mcp.types import TextContent app = FastMCP() @app.tool() async def image_generator(ctx: Context, subject: str, style: str) -> str: prompt = f'{subject=} {style=}' # `ctx.session.create_message` is the sampling call result = await ctx.session.create_message( [SamplingMessage(role='user', content=TextContent(type='text', text=prompt))], max_tokens=1_024, system_prompt='Generate an SVG image as per the user input', ) assert isinstance(result.content, TextContent) path = Path(f'{subject}_{style}.svg') # remove triple backticks if the svg was returned within markdown if m := re.search(r'^```\w*$(.+?)```$', result.content.text, re.S | re.M): path.write_text(m.group(1)) else: path.write_text(result.content.text) return f'See {path}' if __name__ == '__main__': # run the server via stdio app.run() ```` Using this server with an `Agent` will automatically allow sampling: sampling_mcp_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio('python', args=['generate_svg.py']) agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: agent.set_mcp_sampling_model() result = await agent.run('Create an image of a robot in a punk style.') print(result.output) #> Image file written to robot_punk.svg. 
``` *(This example is complete, it can be run "as is")* You can disallow sampling by setting allow_sampling=False when creating the server reference, e.g.: sampling_disallowed.py ```python from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio( 'python', args=['generate_svg.py'], allow_sampling=False, ) ``` ## Elicitation In MCP, [elicitation](https://modelcontextprotocol.io/docs/concepts/elicitation) allows a server to request [structured input](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#supported-schema-types) from the client for missing or additional context during a session. Elicitation lets a server essentially say "Hold on - I need to know X before I can continue" rather than requiring everything upfront or taking a shot in the dark. ### How Elicitation works Elicitation introduces a new protocol message type called [`ElicitRequest`](https://modelcontextprotocol.io/specification/2025-06-18/schema#elicitrequest), which is sent from the server to the client when it needs additional information. The client can then respond with an [`ElicitResult`](https://modelcontextprotocol.io/specification/2025-06-18/schema#elicitresult) or an `ErrorData` message. Here's a typical interaction: - The user makes a request to the MCP server (e.g. "Book a table at that Italian place") - The server identifies that it needs more information (e.g. "Which Italian place?", "What date and time?") - The server sends an `ElicitRequest` to the client asking for the missing information. - The client receives the request, presents it to the user (e.g. via a terminal prompt, GUI dialog, or web interface). - The user provides the requested information, or chooses to `decline` or `cancel` the request. - The client sends an `ElicitResult` back to the server with the user's response. - With the structured data, the server can continue processing the original request. This allows for a more interactive and user-friendly experience, especially for multi-stage workflows. Instead of requiring all information upfront, the server can ask for it as needed, making the interaction feel more natural. ### Setting up Elicitation To enable elicitation, provide an elicitation_callback function when creating your MCP server instance: restaurant_server.py ```python from mcp.server.fastmcp import Context, FastMCP from pydantic import BaseModel, Field mcp = FastMCP(name='Restaurant Booking') class BookingDetails(BaseModel): """Schema for restaurant booking information.""" restaurant: str = Field(description='Choose a restaurant') party_size: int = Field(description='Number of people', ge=1, le=8) date: str = Field(description='Reservation date (DD-MM-YYYY)') @mcp.tool() async def book_table(ctx: Context) -> str: """Book a restaurant table with user input.""" # Ask user for booking details using Pydantic schema result = await ctx.elicit(message='Please provide your booking details:', schema=BookingDetails) if result.action == 'accept' and result.data: booking = result.data return f'✅ Booked table for {booking.party_size} at {booking.restaurant} on {booking.date}' elif result.action == 'decline': return 'No problem! Maybe another time.' else: # cancel return 'Booking cancelled.' if __name__ == '__main__': mcp.run(transport='stdio') ``` This server demonstrates elicitation by requesting structured booking details from the client when the `book_table` tool is called.
Here's how to create a client that handles these elicitation requests: client_example.py ```python import asyncio from typing import Any from mcp.client.session import ClientSession from mcp.shared.context import RequestContext from mcp.types import ElicitRequestParams, ElicitResult from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio async def handle_elicitation( context: RequestContext[ClientSession, Any, Any], params: ElicitRequestParams, ) -> ElicitResult: """Handle elicitation requests from MCP server.""" print(f'\n{params.message}') if not params.requestedSchema: response = input('Response: ') return ElicitResult(action='accept', content={'response': response}) # Collect data for each field properties = params.requestedSchema['properties'] data = {} for field, info in properties.items(): description = info.get('description', field) value = input(f'{description}: ') # Convert to proper type based on JSON schema if info.get('type') == 'integer': data[field] = int(value) else: data[field] = value # Confirm confirm = input('\nConfirm booking? (y/n/c): ').lower() if confirm == 'y': print('Booking details:', data) return ElicitResult(action='accept', content=data) elif confirm == 'n': return ElicitResult(action='decline') else: return ElicitResult(action='cancel') # Set up MCP server connection restaurant_server = MCPServerStdio( 'python', args=['restaurant_server.py'], elicitation_callback=handle_elicitation ) # Create agent agent = Agent('openai:gpt-4o', toolsets=[restaurant_server]) async def main(): """Run the agent to book a restaurant table.""" async with agent: result = await agent.run('Book me a table') print(f'\nResult: {result.output}') if __name__ == '__main__': asyncio.run(main()) ``` ### Supported Schema Types MCP elicitation supports string, number, boolean, and enum types with flat object structures only. These limitations ensure reliable cross-client compatibility. See [supported schema types](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#supported-schema-types) for details. ### Security MCP Elicitation requires careful handling - servers must not request sensitive information, and clients must implement user approval controls with clear explanations. See [security considerations](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#security-considerations) for details. # Model Context Protocol (MCP) Pydantic AI supports [Model Context Protocol (MCP)](https://modelcontextprotocol.io) in two ways: 1. [Agents](../../agents/) act as an MCP Client, connecting to MCP servers to use their tools, [learn more …](../client/) 1. Agents can be used within MCP servers, [learn more …](../server/) ## What is MCP? The Model Context Protocol is a standardized protocol that allow AI applications (including programmatic agents like Pydantic AI, coding agents like [cursor](https://www.cursor.com/), and desktop applications like [Claude Desktop](https://claude.ai/download)) to connect to external tools and services using a common interface. As with other protocols, the dream of MCP is that a wide range of applications can speak to each other without the need for specific integrations. There is a great list of MCP servers at [github.com/modelcontextprotocol/servers](https://github.com/modelcontextprotocol/servers). 
Some examples of what this means: - Pydantic AI could use a web search service implemented as an MCP server to implement a deep research agent - Cursor could connect to the [Pydantic Logfire](https://github.com/pydantic/logfire-mcp) MCP server to search logs, traces and metrics to gain context while fixing a bug - Pydantic AI, or any other MCP client could connect to our [Run Python](https://github.com/pydantic/mcp-run-python) MCP server to run arbitrary Python code in a sandboxed environment # Server Pydantic AI models can also be used within MCP Servers. ## MCP Server Here's a simple example of a [Python MCP server](https://github.com/modelcontextprotocol/python-sdk) using Pydantic AI within a tool call: mcp_server.py ```py from mcp.server.fastmcp import FastMCP from pydantic_ai import Agent server = FastMCP('Pydantic AI Server') server_agent = Agent( 'anthropic:claude-3-5-haiku-latest', system_prompt='always reply in rhyme' ) @server.tool() async def poet(theme: str) -> str: """Poem generator""" r = await server_agent.run(f'write a poem about {theme}') return r.output if __name__ == '__main__': server.run() ``` ## Simple client This server can be queried with any MCP client. Here is an example using the Python SDK directly: mcp_client.py ```py import asyncio import os from mcp import ClientSession, StdioServerParameters from mcp.client.stdio import stdio_client async def client(): server_params = StdioServerParameters( command='python', args=['mcp_server.py'], env=os.environ ) async with stdio_client(server_params) as (read, write): async with ClientSession(read, write) as session: await session.initialize() result = await session.call_tool('poet', {'theme': 'socks'}) print(result.content[0].text) """ Oh, socks, those garments soft and sweet, That nestle softly 'round our feet, From cotton, wool, or blended thread, They keep our toes from feeling dread. """ if __name__ == '__main__': asyncio.run(client()) ``` ## MCP Sampling What is MCP Sampling? See the [MCP client docs](../client/#mcp-sampling) for details of what MCP sampling is, and how you can support it when using Pydantic AI as an MCP client. When Pydantic AI agents are used within MCP servers, they can use sampling via MCPSamplingModel. We can extend the above example to use sampling so instead of connecting directly to the LLM, the agent calls back through the MCP client to make LLM calls. mcp_server_sampling.py ```py from mcp.server.fastmcp import Context, FastMCP from pydantic_ai import Agent from pydantic_ai.models.mcp_sampling import MCPSamplingModel server = FastMCP('Pydantic AI Server with sampling') server_agent = Agent(system_prompt='always reply in rhyme') @server.tool() async def poet(ctx: Context, theme: str) -> str: """Poem generator""" r = await server_agent.run(f'write a poem about {theme}', model=MCPSamplingModel(session=ctx.session)) return r.output if __name__ == '__main__': server.run() # run the server over stdio ``` The [above](#simple-client) client does not support sampling, so if you tried to use it with this server you'd get an error. 
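For comparison, here's a minimal sketch of a sampling-capable client built from a Pydantic AI agent, following the pattern shown in the [client docs](../client/#mcp-sampling). The file name and the prompt are illustrative, and we assume the sampling server above is saved as `mcp_server_sampling.py`: sampling_agent_client.py
```python
import asyncio

from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio

# Connect to the sampling server defined above over stdio.
server = MCPServerStdio('python', args=['mcp_server_sampling.py'])
agent = Agent('openai:gpt-4o', toolsets=[server])


async def main():
    async with agent:
        # Answer the server's sampling requests with the agent's own model.
        agent.set_mcp_sampling_model()
        result = await agent.run('Write a poem about socks')
        print(result.output)


if __name__ == '__main__':
    asyncio.run(main())
```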
The simplest way to support sampling in an MCP client is to [use](../client/#mcp-sampling) a Pydantic AI agent as the client, but if you wanted to support sampling with the vanilla MCP SDK, you could do so like this: mcp_client_sampling.py ```py import asyncio from typing import Any from mcp import ClientSession, StdioServerParameters from mcp.client.stdio import stdio_client from mcp.shared.context import RequestContext from mcp.types import ( CreateMessageRequestParams, CreateMessageResult, ErrorData, TextContent, ) async def sampling_callback( context: RequestContext[ClientSession, Any], params: CreateMessageRequestParams ) -> CreateMessageResult | ErrorData: print('sampling system prompt:', params.systemPrompt) #> sampling system prompt: always reply in rhyme print('sampling messages:', params.messages) """ sampling messages: [ SamplingMessage( role='user', content=TextContent( type='text', text='write a poem about socks', annotations=None, meta=None, ), ) ] """ # TODO get the response content by calling an LLM... response_content = 'Socks for a fox.' return CreateMessageResult( role='assistant', content=TextContent(type='text', text=response_content), model='fictional-llm', ) async def client(): server_params = StdioServerParameters(command='python', args=['mcp_server_sampling.py']) async with stdio_client(server_params) as (read, write): async with ClientSession(read, write, sampling_callback=sampling_callback) as session: await session.initialize() result = await session.call_tool('poet', {'theme': 'socks'}) print(result.content[0].text) #> Socks for a fox. if __name__ == '__main__': asyncio.run(client()) ``` *(This example is complete, it can be run "as is")* # Optional # Command Line Interface (CLI) **Pydantic AI** comes with a CLI, `clai` (pronounced "clay") which you can use to interact with various LLMs from the command line. It provides a convenient way to chat with language models and quickly get answers right in the terminal. We originally developed this CLI for our own use, but found ourselves using it so frequently that we decided to share it as part of the Pydantic AI package. We plan to continue adding new features, such as interaction with MCP servers, access to tools, and more. ## Usage You'll need to set an environment variable depending on the provider you intend to use. E.g. if you're using OpenAI, set the `OPENAI_API_KEY` environment variable: ```bash export OPENAI_API_KEY='your-api-key-here' ``` Then with [`uvx`](https://docs.astral.sh/uv/guides/tools/), run: ```bash uvx clai ``` Or to install `clai` globally [with `uv`](https://docs.astral.sh/uv/guides/tools/#installing-tools), run: ```bash uv tool install clai ... clai ``` Or with `pip`, run: ```bash pip install clai ... clai ``` Either way, running `clai` will start an interactive session where you can chat with the AI model. 
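You should also be able to pass a prompt directly as an argument for a one-off answer (the same prompt-as-argument form appears with the `--agent` flag below), e.g.:
```bash
uvx clai "What is the capital of France?"
```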
Special commands available in interactive mode: - `/exit`: Exit the session - `/markdown`: Show the last response in markdown format - `/multiline`: Toggle multiline input mode (use Ctrl+D to submit) - `/cp`: Copy the last response to clipboard ### Help To get help on the CLI, use the `--help` flag: ```bash uvx clai --help ``` ### Choose a model You can specify which model to use with the `--model` flag: ```bash uvx clai --model anthropic:claude-sonnet-4-0 ``` (a full list of models available can be printed with `uvx clai --list-models`) ### Custom Agents You can specify a custom agent using the `--agent` flag with a module path and variable name: custom_agent.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4.1', instructions='You always respond in Italian.') ``` Then run: ```bash uvx clai --agent custom_agent:agent "What's the weather today?" ``` The format must be `module:variable` where: - `module` is the importable Python module path - `variable` is the name of the Agent instance in that module Additionally, you can directly launch CLI mode from an `Agent` instance using `Agent.to_cli_sync()`: agent_to_cli_sync.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4.1', instructions='You always respond in Italian.') agent.to_cli_sync() ``` You can also use the async interface with `Agent.to_cli()`: agent_to_cli.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4.1', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` *(You'll need to add `asyncio.run(main())` to run `main`)* ### Message History Both `Agent.to_cli()` and `Agent.to_cli_sync()` support a `message_history` parameter, allowing you to continue an existing conversation or provide conversation context: agent_with_history.py ```python from pydantic_ai import Agent from pydantic_ai.messages import ( ModelMessage, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) agent = Agent('openai:gpt-4.1') # Create some conversation history message_history: list[ModelMessage] = [ ModelRequest([UserPromptPart(content='What is 2+2?')]), ModelResponse([TextPart(content='2+2 equals 4.')]) ] # Start CLI with existing conversation context agent.to_cli_sync(message_history=message_history) ``` The CLI will start with the provided conversation history, allowing the agent to refer back to previous exchanges and maintain context throughout the session. # Pydantic Logfire Debugging and Monitoring Applications that use LLMs have some challenges that are well known and understood: LLMs are **slow**, **unreliable** and **expensive**. These applications also have some challenges that most developers have encountered much less often: LLMs are **fickle** and **non-deterministic**. Subtle changes in a prompt can completely change a model's performance, and there's no `EXPLAIN` query you can run to understand why. Warning From a software engineers point of view, you can think of LLMs as the worst database you've ever heard of, but worse. If LLMs weren't so bloody useful, we'd never touch them. To build successful applications with LLMs, we need new tools to understand both model performance, and the behavior of applications that rely on them. LLM Observability tools that just let you understand how your model is performing are useless: making API calls to an LLM is easy, it's building that into an application that's hard. 
## Pydantic Logfire [Pydantic Logfire](https://pydantic.dev/logfire) is an observability platform developed by the team who created and maintain Pydantic Validation and Pydantic AI. Logfire aims to let you understand your entire application: Gen AI, classic predictive AI, HTTP traffic, database queries and everything else a modern application needs, all using OpenTelemetry. Pydantic Logfire is a commercial product Logfire is a commercially supported, hosted platform with an extremely generous and perpetual [free tier](https://pydantic.dev/pricing/). You can sign up and start using Logfire in a couple of minutes. Logfire can also be self-hosted on the enterprise tier. Pydantic AI has built-in (but optional) support for Logfire. That means if the `logfire` package is installed and configured and agent instrumentation is enabled then detailed information about agent runs is sent to Logfire. Otherwise there's virtually no overhead and nothing is sent. Here's an example showing details of running the [Weather Agent](../examples/weather-agent/) in Logfire: A trace is generated for the agent run, and spans are emitted for each model request and tool call. ## Using Logfire To use Logfire, you'll need a Logfire [account](https://logfire.pydantic.dev). The Logfire Python SDK is included with `pydantic-ai`: ```bash pip install pydantic-ai ``` ```bash uv add pydantic-ai ``` Or if you're using the slim package, you can install it with the `logfire` optional group: ```bash pip install "pydantic-ai-slim[logfire]" ``` ```bash uv add "pydantic-ai-slim[logfire]" ``` Then authenticate your local environment with Logfire: ```bash logfire auth ``` ```bash uv run logfire auth ``` And configure a project to send data to: ```bash logfire projects new ``` ```bash uv run logfire projects new ``` (Or use an existing project with `logfire projects use`) This will write to a `.logfire` directory in the current working directory, which the Logfire SDK will use for configuration at run time. With that, you can start using Logfire to instrument Pydantic AI code: instrument_pydantic_ai.py ```python import logfire from pydantic_ai import Agent logfire.configure() # (1)! logfire.instrument_pydantic_ai() # (2)! agent = Agent('openai:gpt-4o', instructions='Be concise, reply with one sentence.') result = agent.run_sync('Where does "hello world" come from?') # (3)! print(result.output) """ The first known use of "hello, world" was in a 1974 textbook about the C programming language. """ ``` 1. logfire.configure() configures the SDK, by default it will find the write token from the `.logfire` directory, but you can also pass a token directly. 1. logfire.instrument_pydantic_ai() enables instrumentation of Pydantic AI. 1. Since we've enabled instrumentation, a trace will be generated for each run, with spans emitted for models calls and tool function execution *(This example is complete, it can be run "as is")* Which will display in Logfire thus: The [Logfire documentation](https://logfire.pydantic.dev/docs/) has more details on how to use Logfire, including how to instrument other libraries like [HTTPX](https://logfire.pydantic.dev/docs/integrations/http-clients/httpx/) and [FastAPI](https://logfire.pydantic.dev/docs/integrations/web-frameworks/fastapi/). Since Logfire is built on [OpenTelemetry](https://opentelemetry.io/), you can use the Logfire Python SDK to send data to any OpenTelemetry collector, see [below](#using-opentelemetry). 
### Debugging To demonstrate how Logfire can let you visualise the flow of a Pydantic AI run, here's the view you get from Logfire while running the [chat app examples](../examples/chat-app/): ### Monitoring Performance We can also query data with SQL in Logfire to monitor the performance of an application. Here's a real world example of using Logfire to monitor Pydantic AI runs inside Logfire itself: ### Monitoring HTTP Requests "F\*\*k you, show me the prompt." As per Hamel Husain's influential 2024 blog post ["Fuck You, Show Me The Prompt."](https://hamel.dev/blog/posts/prompt/) (bear with the capitalization, the point is valid), it's often useful to be able to view the raw HTTP requests and responses made to model providers. To observe raw HTTP requests made to model providers, you can use Logfire's [HTTPX instrumentation](https://logfire.pydantic.dev/docs/integrations/http-clients/httpx/) since all provider SDKs use the [HTTPX](https://www.python-httpx.org/) library internally. with_logfire_instrument_httpx.py ```py import logfire from pydantic_ai import Agent logfire.configure() logfire.instrument_pydantic_ai() logfire.instrument_httpx(capture_all=True) # (1)! agent = Agent('openai:gpt-4o') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` 1. See the logfire.instrument_httpx docs for more details; `capture_all=True` means both headers and body are captured for both the request and response. without_logfire_instrument_httpx.py ```py import logfire from pydantic_ai import Agent logfire.configure() logfire.instrument_pydantic_ai() agent = Agent('openai:gpt-4o') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` ## Using OpenTelemetry Pydantic AI's instrumentation uses [OpenTelemetry](https://opentelemetry.io/) (OTel), which Logfire is based on. This means you can debug and monitor Pydantic AI with any OpenTelemetry backend. Pydantic AI follows the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/), so while we think you'll have the best experience using the Logfire platform, you should be able to use any OTel service with GenAI support. ### Logfire with an alternative OTel backend You can use the Logfire SDK completely freely and send the data to any OpenTelemetry backend. Here's an example of configuring the Logfire library to send data to the excellent [otel-tui](https://github.com/ymtdzzz/otel-tui) — an open source terminal-based OTel backend and viewer (no association with Pydantic Validation). Run `otel-tui` with docker (see [the otel-tui readme](https://github.com/ymtdzzz/otel-tui) for more instructions): Terminal ```text docker run --rm -it -p 4318:4318 --name otel-tui ymtdzzz/otel-tui:latest ``` then run: otel_tui.py ```python import os import logfire from pydantic_ai import Agent os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'] = 'http://localhost:4318' # (1)! logfire.configure(send_to_logfire=False) # (2)! logfire.instrument_pydantic_ai() logfire.instrument_httpx(capture_all=True) agent = Agent('openai:gpt-4o') result = agent.run_sync('What is the capital of France?') print(result.output) #> Paris ``` 1. Set the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable to the URL of your OpenTelemetry backend. If you're using a backend that requires authentication, you may need to set [other environment variables](https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/).
Of course, these can also be set outside the process, e.g. with `export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318`. 1. We configure Logfire to disable sending data to the Logfire OTel backend itself. If you removed `send_to_logfire=False`, data would be sent to both Logfire and your OpenTelemetry backend. Running the above code will send tracing data to `otel-tui`, which will display like this: Running the [weather agent](../examples/weather-agent/) example connected to `otel-tui` shows how it can be used to visualise a more complex trace: For more information on using the Logfire SDK to send data to alternative backends, see [the Logfire documentation](https://logfire.pydantic.dev/docs/how-to-guides/alternative-backends/). ### OTel without Logfire You can also emit OpenTelemetry data from Pydantic AI without using Logfire at all. To do this, you'll need to install and configure the OpenTelemetry packages you need. To run the following examples, use Terminal ```text uv run \ --with 'pydantic-ai-slim[openai]' \ --with opentelemetry-sdk \ --with opentelemetry-exporter-otlp \ raw_otel.py ``` raw_otel.py ```python import os from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter from opentelemetry.sdk.trace import TracerProvider from opentelemetry.sdk.trace.export import BatchSpanProcessor from opentelemetry.trace import set_tracer_provider from pydantic_ai import Agent os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'] = 'http://localhost:4318' exporter = OTLPSpanExporter() span_processor = BatchSpanProcessor(exporter) tracer_provider = TracerProvider() tracer_provider.add_span_processor(span_processor) set_tracer_provider(tracer_provider) Agent.instrument_all() agent = Agent('openai:gpt-4o') result = agent.run_sync('What is the capital of France?') print(result.output) #> Paris ``` ### Alternative Observability backends Because Pydantic AI uses OpenTelemetry for observability, you can easily configure it to send data to any OpenTelemetry-compatible backend, not just our observability platform [Pydantic Logfire](#pydantic-logfire). The following providers have dedicated documentation on Pydantic AI: - [Langfuse](https://langfuse.com/docs/integrations/pydantic-ai) - [W&B Weave](https://weave-docs.wandb.ai/guides/integrations/pydantic_ai/) - [Arize](https://arize.com/docs/ax/observe/tracing-integrations-auto/pydantic-ai) - [Openlayer](https://www.openlayer.com/docs/integrations/pydantic-ai) - [OpenLIT](https://docs.openlit.io/latest/integrations/pydantic) - [LangWatch](https://docs.langwatch.ai/integration/python/integrations/pydantic-ai) - [Patronus AI](https://docs.patronus.ai/docs/percival/pydantic) - [Opik](https://www.comet.com/docs/opik/tracing/integrations/pydantic-ai) - [mlflow](https://mlflow.org/docs/latest/genai/tracing/integrations/listing/pydantic_ai) - [Agenta](https://docs.agenta.ai/observability/integrations/pydanticai) - [Confident AI](https://documentation.confident-ai.com/docs/llm-tracing/integrations/pydanticai) ## Advanced usage ### Configuring data format Pydantic AI follows the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/). Specifically, it follows version 1.37.0 of the conventions by default. To use [version 1.36.0](https://github.com/open-telemetry/semantic-conventions/blob/v1.36.0/docs/gen-ai/README.md) or older, pass InstrumentationSettings(version=1) (the default is `version=2`).
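For example, here's a minimal sketch of pinning the older conventions (the file name is illustrative; `InstrumentationSettings` and the `instrument` argument are covered later in this section): instrumentation_settings_version.py
```python
from pydantic_ai import Agent, InstrumentationSettings

# Use the older (1.36.0) GenAI semantic conventions instead of the 1.37.0 default.
settings = InstrumentationSettings(version=1)

# Apply to a single agent...
agent = Agent('openai:gpt-4o', instrument=settings)

# ...or to all agents:
Agent.instrument_all(settings)
```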
Moreover, those semantic conventions specify that messages should be captured as individual events (logs) that are children of the request span, whereas by default, Pydantic AI instead collects these events into a JSON array which is set as a single large attribute called `events` on the request span. To change this, use `event_mode='logs'`: instrumentation_settings_event_mode.py ```python import logfire from pydantic_ai import Agent logfire.configure() logfire.instrument_pydantic_ai(version=1, event_mode='logs') agent = Agent('openai:gpt-4o') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` This won't look as good in the Logfire UI, and will also be removed from Pydantic AI in a future release, but may be useful for backwards compatibility. Note that the OpenTelemetry Semantic Conventions are still experimental and are likely to change. ### Setting OpenTelemetry SDK providers By default, the global `TracerProvider` and `EventLoggerProvider` are used. These are set automatically by `logfire.configure()`. They can also be set by the `set_tracer_provider` and `set_event_logger_provider` functions in the OpenTelemetry Python SDK. You can set custom providers with InstrumentationSettings. instrumentation_settings_providers.py ```python from opentelemetry.sdk._events import EventLoggerProvider from opentelemetry.sdk.trace import TracerProvider from pydantic_ai import Agent, InstrumentationSettings instrumentation_settings = InstrumentationSettings( tracer_provider=TracerProvider(), event_logger_provider=EventLoggerProvider(), ) agent = Agent('openai:gpt-4o', instrument=instrumentation_settings) # or to instrument all agents: Agent.instrument_all(instrumentation_settings) ``` ### Instrumenting a specific `Model` instrumented_model_example.py ```python from pydantic_ai import Agent from pydantic_ai.models.instrumented import InstrumentationSettings, InstrumentedModel settings = InstrumentationSettings() model = InstrumentedModel('openai:gpt-4o', settings) agent = Agent(model) ``` ### Excluding binary content excluding_binary_content.py ```python from pydantic_ai import Agent, InstrumentationSettings instrumentation_settings = InstrumentationSettings(include_binary_content=False) agent = Agent('openai:gpt-4o', instrument=instrumentation_settings) # or to instrument all agents: Agent.instrument_all(instrumentation_settings) ``` ### Excluding prompts and completions For privacy and security reasons, you may want to monitor your agent's behavior and performance without exposing sensitive user data or proprietary prompts in your observability platform. Pydantic AI allows you to exclude the actual content from instrumentation events while preserving the structural information needed for debugging and monitoring. When `include_content=False` is set, Pydantic AI will exclude sensitive content from OpenTelemetry events, including user prompts and model completions, tool call arguments and responses, and any other message content. 
excluding_sensitive_content.py ```python from pydantic_ai import Agent from pydantic_ai.models.instrumented import InstrumentationSettings instrumentation_settings = InstrumentationSettings(include_content=False) agent = Agent('openai:gpt-4o', instrument=instrumentation_settings) # or to instrument all agents: Agent.instrument_all(instrumentation_settings) ``` This setting is particularly useful in production environments where compliance requirements or data sensitivity concerns make it necessary to limit what content is sent to your observability platform. # Unit testing Writing unit tests for Pydantic AI code is just like unit tests for any other Python code. Because for the most part they're nothing new, we have pretty well established tools and patterns for writing and running these kinds of tests. Unless you're really sure you know better, you'll probably want to follow roughly this strategy: - Use [`pytest`](https://docs.pytest.org/en/stable/) as your test harness - If you find yourself typing out long assertions, use [inline-snapshot](https://15r10nk.github.io/inline-snapshot/latest/) - Similarly, [dirty-equals](https://dirty-equals.helpmanual.io/latest/) can be useful for comparing large data structures - Use TestModel or FunctionModel in place of your actual model to avoid the usage, latency and variability of real LLM calls - Use Agent.override to replace an agent's model, dependencies, or toolsets inside your application logic - Set ALLOW_MODEL_REQUESTS=False globally to block any requests from being made to non-test models accidentally ### Unit testing with `TestModel` The simplest and fastest way to exercise most of your application code is using TestModel, this will (by default) call all tools in the agent, then return either plain text or a structured response depending on the return type of the agent. `TestModel` is not magic The "clever" (but not too clever) part of `TestModel` is that it will attempt to generate valid structured data for [function tools](../tools/) and [output types](../output/#structured-output) based on the schema of the registered tools. There's no ML or AI in `TestModel`, it's just plain old procedural Python code that tries to generate data that satisfies the JSON schema of a tool. The resulting data won't look pretty or relevant, but it should pass Pydantic's validation in most cases. If you want something more sophisticated, use FunctionModel and write your own data generation logic. Let's write unit tests for the following application code: weather_app.py ```python import asyncio from datetime import date from pydantic_ai import Agent, RunContext from fake_database import DatabaseConn # (1)! from weather_service import WeatherService # (2)! weather_agent = Agent( 'openai:gpt-4o', deps_type=WeatherService, system_prompt='Providing a weather forecast at the locations the user provides.', ) @weather_agent.tool def weather_forecast( ctx: RunContext[WeatherService], location: str, forecast_date: date ) -> str: if forecast_date < date.today(): # (3)! return ctx.deps.get_historic_weather(location, forecast_date) else: return ctx.deps.get_forecast(location, forecast_date) async def run_weather_forecast( # (4)! 
user_prompts: list[tuple[str, int]], conn: DatabaseConn ): """Run weather forecast for a list of user prompts and save.""" async with WeatherService() as weather_service: async def run_forecast(prompt: str, user_id: int): result = await weather_agent.run(prompt, deps=weather_service) await conn.store_forecast(user_id, result.output) # run all prompts in parallel await asyncio.gather( *(run_forecast(prompt, user_id) for (prompt, user_id) in user_prompts) ) ``` 1. `DatabaseConn` is a class that holds a database connection 1. `WeatherService` has methods to get weather forecasts and historic data about the weather 1. We need to call a different endpoint depending on whether the date is in the past or the future, you'll see why this nuance is important below 1. This function is the code we want to test, together with the agent it uses Here we have a function that takes a list of `(user_prompt, user_id)` tuples, gets a weather forecast for each prompt, and stores the result in the database. **We want to test this code without having to mock certain objects or modify our code so we can pass test objects in.** Here's how we would write tests using TestModel: test_weather_app.py ```python from datetime import timezone import pytest from dirty_equals import IsNow, IsStr from pydantic_ai import models, capture_run_messages, RequestUsage from pydantic_ai.models.test import TestModel from pydantic_ai.messages import ( ModelResponse, SystemPromptPart, TextPart, ToolCallPart, ToolReturnPart, UserPromptPart, ModelRequest, ) from fake_database import DatabaseConn from weather_app import run_weather_forecast, weather_agent pytestmark = pytest.mark.anyio # (1)! models.ALLOW_MODEL_REQUESTS = False # (2)! async def test_forecast(): conn = DatabaseConn() user_id = 1 with capture_run_messages() as messages: with weather_agent.override(model=TestModel()): # (3)! prompt = 'What will the weather be like in London on 2024-11-28?' await run_weather_forecast([(prompt, user_id)], conn) # (4)! forecast = await conn.get_forecast(user_id) assert forecast == '{"weather_forecast":"Sunny with a chance of rain"}' # (5)! assert messages == [ # (6)! ModelRequest( parts=[ SystemPromptPart( content='Providing a weather forecast at the locations the user provides.', timestamp=IsNow(tz=timezone.utc), ), UserPromptPart( content='What will the weather be like in London on 2024-11-28?', timestamp=IsNow(tz=timezone.utc), # (7)! ), ] ), ModelResponse( parts=[ ToolCallPart( tool_name='weather_forecast', args={ 'location': 'a', 'forecast_date': '2024-01-01', # (8)! }, tool_call_id=IsStr(), ) ], usage=RequestUsage( input_tokens=71, output_tokens=7, ), model_name='test', timestamp=IsNow(tz=timezone.utc), ), ModelRequest( parts=[ ToolReturnPart( tool_name='weather_forecast', content='Sunny with a chance of rain', tool_call_id=IsStr(), timestamp=IsNow(tz=timezone.utc), ), ], ), ModelResponse( parts=[ TextPart( content='{"weather_forecast":"Sunny with a chance of rain"}', ) ], usage=RequestUsage( input_tokens=77, output_tokens=16, ), model_name='test', timestamp=IsNow(tz=timezone.utc), ), ] ``` 1. We're using [anyio](https://anyio.readthedocs.io/en/stable/) to run async tests. 1. This is a safety measure to make sure we don't accidentally make real requests to the LLM while testing, see ALLOW_MODEL_REQUESTS for more details. 1. We're using Agent.override to replace the agent's model with TestModel, the nice thing about `override` is that we can replace the model inside agent without needing access to the agent `run*` methods call site. 1. 
Now we call the function we want to test inside the `override` context manager. 1. By default, `TestModel` will return a JSON string summarising the tool calls made, and what was returned. If you wanted to customise the response to something more closely aligned with the domain, you could add custom_output_text='Sunny' when defining `TestModel`. 1. So far we don't actually know which tools were called or with which values; we can use capture_run_messages to inspect messages from the most recent run and assert the exchange between the agent and the model occurred as expected. 1. The IsNow helper allows us to use declarative asserts even with data which will contain timestamps that change over time. 1. `TestModel` isn't doing anything clever to extract values from the prompt, so these values are hardcoded. ### Unit testing with `FunctionModel` The above tests are a great start, but careful readers will notice that `WeatherService.get_forecast` is never called since `TestModel` calls `weather_forecast` with a date in the past. To fully exercise `weather_forecast`, we need to use FunctionModel to customise how the tool is called. Here's an example of using `FunctionModel` to test the `weather_forecast` tool with custom inputs: test_weather_app2.py ```python import re import pytest from pydantic_ai import models from pydantic_ai.messages import ( ModelMessage, ModelResponse, TextPart, ToolCallPart, ) from pydantic_ai.models.function import AgentInfo, FunctionModel from fake_database import DatabaseConn from weather_app import run_weather_forecast, weather_agent pytestmark = pytest.mark.anyio models.ALLOW_MODEL_REQUESTS = False def call_weather_forecast( # (1)! messages: list[ModelMessage], info: AgentInfo ) -> ModelResponse: if len(messages) == 1: # first call, call the weather forecast tool user_prompt = messages[0].parts[-1] m = re.search(r'\d{4}-\d{2}-\d{2}', user_prompt.content) assert m is not None args = {'location': 'London', 'forecast_date': m.group()} # (2)! return ModelResponse(parts=[ToolCallPart('weather_forecast', args)]) else: # second call, return the forecast msg = messages[-1].parts[0] assert msg.part_kind == 'tool-return' return ModelResponse(parts=[TextPart(f'The forecast is: {msg.content}')]) async def test_forecast_future(): conn = DatabaseConn() user_id = 1 with weather_agent.override(model=FunctionModel(call_weather_forecast)): # (3)! prompt = 'What will the weather be like in London on 2032-01-01?' await run_weather_forecast([(prompt, user_id)], conn) forecast = await conn.get_forecast(user_id) assert forecast == 'The forecast is: Rainy with a chance of sun' ``` 1. We define a function `call_weather_forecast` that will be called by `FunctionModel` in place of the LLM; this function has access to the list of ModelMessages that make up the run, and AgentInfo which contains information about the agent and the function tools and return tools. 1. Our function is slightly intelligent in that it tries to extract a date from the prompt, but just hard codes the location. 1. We use FunctionModel to replace the agent's model with our custom function. ### Overriding model via pytest fixtures If you're writing lots of tests that all require the model to be overridden, you can use [pytest fixtures](https://docs.pytest.org/en/6.2.x/fixture.html) to override the model with TestModel or FunctionModel in a reusable way.
Here's an example of a fixture that overrides the model with `TestModel`: test_agent.py ```python import pytest from pydantic_ai.models.test import TestModel from weather_app import weather_agent @pytest.fixture def override_weather_agent(): with weather_agent.override(model=TestModel()): yield async def test_forecast(override_weather_agent: None): ... # test code here ``` # Agent User Interaction (AG-UI) Example of using Pydantic AI agents with the [AG-UI Dojo](https://github.com/ag-ui-protocol/ag-ui/tree/main/typescript-sdk/apps/dojo) example app. See the [AG-UI docs](../../ag-ui/) for more information about the AG-UI integration. Demonstrates: - [AG-UI](../../ag-ui/) - [Tools](../../tools/) ## Prerequisites - An [OpenAI API key](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key) ## Running the Example With [dependencies installed and environment variables set](../setup/#usage) you will need two command line windows. ### Pydantic AI AG-UI backend Setup your OpenAI API Key ```bash export OPENAI_API_KEY= ``` Start the Pydantic AI AG-UI example backend. ```bash python -m pydantic_ai_examples.ag_ui ``` ```bash uv run -m pydantic_ai_examples.ag_ui ``` ### AG-UI Dojo example frontend Next run the AG-UI Dojo example frontend. 1. Clone the [AG-UI repository](https://github.com/ag-ui-protocol/ag-ui) ```shell git clone https://github.com/ag-ui-protocol/ag-ui.git ``` 1. Change into to the `ag-ui/typescript-sdk` directory ```shell cd ag-ui/typescript-sdk ``` 1. Run the Dojo app following the [official instructions](https://github.com/ag-ui-protocol/ag-ui/tree/main/typescript-sdk/apps/dojo#development-setup) 1. Visit 1. Select View `Pydantic AI` from the sidebar ## Feature Examples ### Agentic Chat This demonstrates a basic agent interaction including Pydantic AI server side tools and AG-UI client side tools. If you've [run the example](#running-the-example), you can view it at . #### Agent Tools - `time` - Pydantic AI tool to check the current time for a time zone - `background` - AG-UI tool to set the background color of the client window #### Agent Prompts ```text What is the time in New York? ``` ```text Change the background to blue ``` A complex example which mixes both AG-UI and Pydantic AI tools: ```text Perform the following steps, waiting for the response of each step before continuing: 1. Get the time 2. Set the background to red 3. Get the time 4. Report how long the background set took by diffing the two times ``` #### Agentic Chat - Code [ag_ui/api/agentic_chat.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/agentic_chat.py) ```py """Agentic Chat feature.""" from __future__ import annotations from datetime import datetime from zoneinfo import ZoneInfo from pydantic_ai import Agent agent = Agent('openai:gpt-4o-mini') app = agent.to_ag_ui() @agent.tool_plain async def current_time(timezone: str = 'UTC') -> str: """Get the current time in ISO format. Args: timezone: The timezone to use. Returns: The current time in ISO format string. """ tz: ZoneInfo = ZoneInfo(timezone) return datetime.now(tz=tz).isoformat() ``` ### Agentic Generative UI Demonstrates a long running task where the agent sends updates to the frontend to let the user know what's happening. If you've [run the example](#running-the-example), you can view it at . 
#### Plan Prompts ```text Create a plan for breakfast and execute it ``` #### Agentic Generative UI - Code [ag_ui/api/agentic_generative_ui.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/agentic_generative_ui.py) ```py """Agentic Generative UI feature.""" from __future__ import annotations from textwrap import dedent from typing import Any, Literal from pydantic import BaseModel, Field from ag_ui.core import EventType, StateDeltaEvent, StateSnapshotEvent from pydantic_ai import Agent StepStatus = Literal['pending', 'completed'] class Step(BaseModel): """Represents a step in a plan.""" description: str = Field(description='The description of the step') status: StepStatus = Field( default='pending', description='The status of the step (e.g., pending, completed)', ) class Plan(BaseModel): """Represents a plan with multiple steps.""" steps: list[Step] = Field(default_factory=list, description='The steps in the plan') class JSONPatchOp(BaseModel): """A class representing a JSON Patch operation (RFC 6902).""" op: Literal['add', 'remove', 'replace', 'move', 'copy', 'test'] = Field( description='The operation to perform: add, remove, replace, move, copy, or test', ) path: str = Field(description='JSON Pointer (RFC 6901) to the target location') value: Any = Field( default=None, description='The value to apply (for add, replace operations)', ) from_: str | None = Field( default=None, alias='from', description='Source path (for move, copy operations)', ) agent = Agent( 'openai:gpt-4o-mini', instructions=dedent( """ When planning use tools only, without any other messages. IMPORTANT: - Use the `create_plan` tool to set the initial state of the steps - Use the `update_plan_step` tool to update the status of each step - Do NOT repeat the plan or summarise it in a message - Do NOT confirm the creation or updates in a message - Do NOT ask the user for additional information or next steps Only one plan can be active at a time, so do not call the `create_plan` tool again until all the steps in current plan are completed. """ ), ) @agent.tool_plain async def create_plan(steps: list[str]) -> StateSnapshotEvent: """Create a plan with multiple steps. Args: steps: List of step descriptions to create the plan. Returns: StateSnapshotEvent containing the initial state of the steps. """ plan: Plan = Plan( steps=[Step(description=step) for step in steps], ) return StateSnapshotEvent( type=EventType.STATE_SNAPSHOT, snapshot=plan.model_dump(), ) @agent.tool_plain async def update_plan_step( index: int, description: str | None = None, status: StepStatus | None = None ) -> StateDeltaEvent: """Update the plan with new steps or changes. Args: index: The index of the step to update. description: The new description for the step. status: The new status for the step. Returns: StateDeltaEvent containing the changes made to the plan. """ changes: list[JSONPatchOp] = [] if description is not None: changes.append( JSONPatchOp( op='replace', path=f'/steps/{index}/description', value=description ) ) if status is not None: changes.append( JSONPatchOp(op='replace', path=f'/steps/{index}/status', value=status) ) return StateDeltaEvent( type=EventType.STATE_DELTA, delta=changes, ) app = agent.to_ag_ui() ``` ### Human in the Loop Demonstrates simple human in the loop workflow where the agent comes up with a plan and the user can approve it using checkboxes. 
#### Task Planning Tools - `generate_task_steps` - AG-UI tool to generate and confirm steps #### Task Planning Prompt ```text Generate a list of steps for cleaning a car for me to review ``` #### Human in the Loop - Code [ag_ui/api/human_in_the_loop.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/human_in_the_loop.py) ```py """Human in the Loop Feature. No special handling is required for this feature. """ from __future__ import annotations from textwrap import dedent from pydantic_ai import Agent agent = Agent( 'openai:gpt-4o-mini', instructions=dedent( """ When planning tasks use tools only, without any other messages. IMPORTANT: - Use the `generate_task_steps` tool to display the suggested steps to the user - Never repeat the plan, or send a message detailing steps - If accepted, confirm the creation of the plan and the number of selected (enabled) steps only - If not accepted, ask the user for more information, DO NOT use the `generate_task_steps` tool again """ ), ) app = agent.to_ag_ui() ``` ### Predictive State Updates Demonstrates how to use the predictive state updates feature to update the state of the UI based on agent responses, including user interaction via user confirmation. If you've [run the example](#running-the-example), you can view it at . #### Story Tools - `write_document` - AG-UI tool to write the document to a window - `document_predict_state` - Pydantic AI tool that enables document state prediction for the `write_document` tool This also shows how to use custom instructions based on shared state information. #### Story Example Starting document text ```markdown Bruce was a good dog, ``` Agent prompt ```text Help me complete my story about bruce the dog, is should be no longer than a sentence. ``` #### Predictive State Updates - Code [ag_ui/api/predictive_state_updates.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/predictive_state_updates.py) ```py """Predictive State feature.""" from __future__ import annotations from textwrap import dedent from pydantic import BaseModel from ag_ui.core import CustomEvent, EventType from pydantic_ai import Agent, RunContext from pydantic_ai.ag_ui import StateDeps class DocumentState(BaseModel): """State for the document being written.""" document: str = '' agent = Agent('openai:gpt-4o-mini', deps_type=StateDeps[DocumentState]) # Tools which return AG-UI events will be sent to the client as part of the # event stream, single events and iterables of events are supported. @agent.tool_plain async def document_predict_state() -> list[CustomEvent]: """Enable document state prediction. Returns: CustomEvent containing the event to enable state prediction. """ return [ CustomEvent( type=EventType.CUSTOM, name='PredictState', value=[ { 'state_key': 'document', 'tool': 'write_document', 'tool_argument': 'document', }, ], ), ] @agent.instructions() async def story_instructions(ctx: RunContext[StateDeps[DocumentState]]) -> str: """Provide instructions for writing document if present. Args: ctx: The run context containing document state information. Returns: Instructions string for the document writing agent. """ return dedent( f"""You are a helpful assistant for writing documents. Before you start writing, you MUST call the `document_predict_state` tool to enable state prediction. To present the document to the user for review, you MUST use the `write_document` tool. When you have written the document, DO NOT repeat it as a message. 
If accepted briefly summarize the changes you made, 2 sentences max, otherwise ask the user to clarify what they want to change. This is the current document: {ctx.deps.state.document} """ ) app = agent.to_ag_ui(deps=StateDeps(DocumentState())) ``` ### Shared State Demonstrates how to use the shared state between the UI and the agent. State sent to the agent is detected by a function based instruction. This then validates the data using a custom pydantic model before using to create the instructions for the agent to follow and send to the client using a AG-UI tool. If you've [run the example](#running-the-example), you can view it at . #### Recipe Tools - `display_recipe` - AG-UI tool to display the recipe in a graphical format #### Recipe Example 1. Customise the basic settings of your recipe 1. Click `Improve with AI` #### Shared State - Code [ag_ui/api/shared_state.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/shared_state.py) ```py """Shared State feature.""" from __future__ import annotations from enum import StrEnum from textwrap import dedent from pydantic import BaseModel, Field from ag_ui.core import EventType, StateSnapshotEvent from pydantic_ai import Agent, RunContext from pydantic_ai.ag_ui import StateDeps class SkillLevel(StrEnum): """The level of skill required for the recipe.""" BEGINNER = 'Beginner' INTERMEDIATE = 'Intermediate' ADVANCED = 'Advanced' class SpecialPreferences(StrEnum): """Special preferences for the recipe.""" HIGH_PROTEIN = 'High Protein' LOW_CARB = 'Low Carb' SPICY = 'Spicy' BUDGET_FRIENDLY = 'Budget-Friendly' ONE_POT_MEAL = 'One-Pot Meal' VEGETARIAN = 'Vegetarian' VEGAN = 'Vegan' class CookingTime(StrEnum): """The cooking time of the recipe.""" FIVE_MIN = '5 min' FIFTEEN_MIN = '15 min' THIRTY_MIN = '30 min' FORTY_FIVE_MIN = '45 min' SIXTY_PLUS_MIN = '60+ min' class Ingredient(BaseModel): """A class representing an ingredient in a recipe.""" icon: str = Field( default='ingredient', description="The icon emoji (not emoji code like '\x1f35e', but the actual emoji like 🥕) of the ingredient", ) name: str amount: str class Recipe(BaseModel): """A class representing a recipe.""" skill_level: SkillLevel = Field( default=SkillLevel.BEGINNER, description='The skill level required for the recipe', ) special_preferences: list[SpecialPreferences] = Field( default_factory=list, description='Any special preferences for the recipe', ) cooking_time: CookingTime = Field( default=CookingTime.FIVE_MIN, description='The cooking time of the recipe' ) ingredients: list[Ingredient] = Field( default_factory=list, description='Ingredients for the recipe', ) instructions: list[str] = Field( default_factory=list, description='Instructions for the recipe' ) class RecipeSnapshot(BaseModel): """A class representing the state of the recipe.""" recipe: Recipe = Field( default_factory=Recipe, description='The current state of the recipe' ) agent = Agent('openai:gpt-4o-mini', deps_type=StateDeps[RecipeSnapshot]) @agent.tool_plain async def display_recipe(recipe: Recipe) -> StateSnapshotEvent: """Display the recipe to the user. Args: recipe: The recipe to display. Returns: StateSnapshotEvent containing the recipe snapshot. """ return StateSnapshotEvent( type=EventType.STATE_SNAPSHOT, snapshot={'recipe': recipe}, ) @agent.instructions async def recipe_instructions(ctx: RunContext[StateDeps[RecipeSnapshot]]) -> str: """Instructions for the recipe generation agent. Args: ctx: The run context containing recipe state information. 
Returns: Instructions string for the recipe generation agent. """ return dedent( f""" You are a helpful assistant for creating recipes. IMPORTANT: - Create a complete recipe using the existing ingredients - Append new ingredients to the existing ones - Use the `display_recipe` tool to present the recipe to the user - Do NOT repeat the recipe in the message, use the tool instead - Do NOT run the `display_recipe` tool multiple times in a row Once you have created the updated recipe and displayed it to the user, summarise the changes in one sentence, don't describe the recipe in detail or send it as a message to the user. The current state of the recipe is: {ctx.deps.state.recipe.model_dump_json(indent=2)} """, ) app = agent.to_ag_ui(deps=StateDeps(RecipeSnapshot())) ``` ### Tool Based Generative UI Demonstrates customised rendering for tool output with used confirmation. If you've [run the example](#running-the-example), you can view it at . #### Haiku Tools - `generate_haiku` - AG-UI tool to display a haiku in English and Japanese #### Haiku Prompt ```text Generate a haiku about formula 1 ``` #### Tool Based Generative UI - Code [ag_ui/api/tool_based_generative_ui.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/tool_based_generative_ui.py) ```py """Tool Based Generative UI feature. No special handling is required for this feature. """ from __future__ import annotations from pydantic_ai import Agent agent = Agent('openai:gpt-4o-mini') app = agent.to_ag_ui() ``` Small but complete example of using Pydantic AI to build a support agent for a bank. Demonstrates: - [dynamic system prompt](../../agents/#system-prompts) - [structured `output_type`](../../output/#structured-output) - [tools](../../tools/) ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.bank_support ``` ```bash uv run -m pydantic_ai_examples.bank_support ``` (or `PYDANTIC_AI_MODEL=gemini-1.5-flash ...`) ## Example Code [bank_support.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/bank_support.py) ```py """Small but complete example of using Pydantic AI to build a support agent for a bank. Run with: uv run -m pydantic_ai_examples.bank_support """ from dataclasses import dataclass from pydantic import BaseModel from pydantic_ai import Agent, RunContext class DatabaseConn: """This is a fake database for example purposes. In reality, you'd be connecting to an external database (e.g. PostgreSQL) to get information about customers. """ @classmethod async def customer_name(cls, *, id: int) -> str | None: if id == 123: return 'John' @classmethod async def customer_balance(cls, *, id: int, include_pending: bool) -> float: if id == 123: if include_pending: return 123.45 else: return 100.00 else: raise ValueError('Customer not found') @dataclass class SupportDependencies: customer_id: int db: DatabaseConn class SupportOutput(BaseModel): support_advice: str """Advice returned to the customer""" block_card: bool """Whether to block their card or not""" risk: int """Risk level of query""" support_agent = Agent( 'openai:gpt-4o', deps_type=SupportDependencies, output_type=SupportOutput, instructions=( 'You are a support agent in our bank, give the ' 'customer support and judge the risk level of their query. ' "Reply using the customer's name." 
), ) @support_agent.instructions async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str: customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id) return f"The customer's name is {customer_name!r}" @support_agent.tool async def customer_balance( ctx: RunContext[SupportDependencies], include_pending: bool ) -> str: """Returns the customer's current account balance.""" balance = await ctx.deps.db.customer_balance( id=ctx.deps.customer_id, include_pending=include_pending, ) return f'${balance:.2f}' if __name__ == '__main__': deps = SupportDependencies(customer_id=123, db=DatabaseConn()) result = support_agent.run_sync('What is my balance?', deps=deps) print(result.output) """ support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1 """ result = support_agent.run_sync('I just lost my card!', deps=deps) print(result.output) """ support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8 """ ``` # Chat App with FastAPI Simple chat app example built with FastAPI. Demonstrates: - [reusing chat history](../../message-history/) - [serializing messages](../../message-history/#accessing-messages-from-results) - [streaming responses](../../output/#streamed-results) This demonstrates storing chat history between requests and using it to give the model context for new responses. Most of the complex logic here is split between `chat_app.py`, which streams the response to the browser, and `chat_app.ts`, which renders messages in the browser. ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.chat_app ``` ```bash uv run -m pydantic_ai_examples.chat_app ``` Then open the app at [localhost:8000](http://localhost:8000). ## Example Code Python code that runs the chat app: [chat_app.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/chat_app.py) ```py """Simple chat app example built with FastAPI. 
Run with: uv run -m pydantic_ai_examples.chat_app """ from __future__ import annotations as _annotations import asyncio import json import sqlite3 from collections.abc import AsyncIterator, Callable from concurrent.futures.thread import ThreadPoolExecutor from contextlib import asynccontextmanager from dataclasses import dataclass from datetime import datetime, timezone from functools import partial from pathlib import Path from typing import Annotated, Any, Literal, TypeVar import fastapi import logfire from fastapi import Depends, Request from fastapi.responses import FileResponse, Response, StreamingResponse from typing_extensions import LiteralString, ParamSpec, TypedDict from pydantic_ai import Agent, UnexpectedModelBehavior from pydantic_ai.messages import ( ModelMessage, ModelMessagesTypeAdapter, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() agent = Agent('openai:gpt-4o') THIS_DIR = Path(__file__).parent @asynccontextmanager async def lifespan(_app: fastapi.FastAPI): async with Database.connect() as db: yield {'db': db} app = fastapi.FastAPI(lifespan=lifespan) logfire.instrument_fastapi(app) @app.get('/') async def index() -> FileResponse: return FileResponse((THIS_DIR / 'chat_app.html'), media_type='text/html') @app.get('/chat_app.ts') async def main_ts() -> FileResponse: """Get the raw typescript code, it's compiled in the browser, forgive me.""" return FileResponse((THIS_DIR / 'chat_app.ts'), media_type='text/plain') async def get_db(request: Request) -> Database: return request.state.db @app.get('/chat/') async def get_chat(database: Database = Depends(get_db)) -> Response: msgs = await database.get_messages() return Response( b'\n'.join(json.dumps(to_chat_message(m)).encode('utf-8') for m in msgs), media_type='text/plain', ) class ChatMessage(TypedDict): """Format of messages sent to the browser.""" role: Literal['user', 'model'] timestamp: str content: str def to_chat_message(m: ModelMessage) -> ChatMessage: first_part = m.parts[0] if isinstance(m, ModelRequest): if isinstance(first_part, UserPromptPart): assert isinstance(first_part.content, str) return { 'role': 'user', 'timestamp': first_part.timestamp.isoformat(), 'content': first_part.content, } elif isinstance(m, ModelResponse): if isinstance(first_part, TextPart): return { 'role': 'model', 'timestamp': m.timestamp.isoformat(), 'content': first_part.content, } raise UnexpectedModelBehavior(f'Unexpected message type for chat app: {m}') @app.post('/chat/') async def post_chat( prompt: Annotated[str, fastapi.Form()], database: Database = Depends(get_db) ) -> StreamingResponse: async def stream_messages(): """Streams new line delimited JSON `Message`s to the client.""" # stream the user prompt so that can be displayed straight away yield ( json.dumps( { 'role': 'user', 'timestamp': datetime.now(tz=timezone.utc).isoformat(), 'content': prompt, } ).encode('utf-8') + b'\n' ) # get the chat history so far to pass as context to the agent messages = await database.get_messages() # run the agent with the user prompt and the chat history async with agent.run_stream(prompt, message_history=messages) as result: async for text in result.stream_output(debounce_by=0.01): # text here is a `str` and the frontend wants # JSON encoded ModelResponse, so we create one m = ModelResponse(parts=[TextPart(text)], 
timestamp=result.timestamp()) yield json.dumps(to_chat_message(m)).encode('utf-8') + b'\n' # add new messages (e.g. the user prompt and the agent response in this case) to the database await database.add_messages(result.new_messages_json()) return StreamingResponse(stream_messages(), media_type='text/plain') P = ParamSpec('P') R = TypeVar('R') @dataclass class Database: """Rudimentary database to store chat messages in SQLite. The SQLite standard library package is synchronous, so we use a thread pool executor to run queries asynchronously. """ con: sqlite3.Connection _loop: asyncio.AbstractEventLoop _executor: ThreadPoolExecutor @classmethod @asynccontextmanager async def connect( cls, file: Path = THIS_DIR / '.chat_app_messages.sqlite' ) -> AsyncIterator[Database]: with logfire.span('connect to DB'): loop = asyncio.get_event_loop() executor = ThreadPoolExecutor(max_workers=1) con = await loop.run_in_executor(executor, cls._connect, file) slf = cls(con, loop, executor) try: yield slf finally: await slf._asyncify(con.close) @staticmethod def _connect(file: Path) -> sqlite3.Connection: con = sqlite3.connect(str(file)) con = logfire.instrument_sqlite3(con) cur = con.cursor() cur.execute( 'CREATE TABLE IF NOT EXISTS messages (id INT PRIMARY KEY, message_list TEXT);' ) con.commit() return con async def add_messages(self, messages: bytes): await self._asyncify( self._execute, 'INSERT INTO messages (message_list) VALUES (?);', messages, commit=True, ) await self._asyncify(self.con.commit) async def get_messages(self) -> list[ModelMessage]: c = await self._asyncify( self._execute, 'SELECT message_list FROM messages order by id' ) rows = await self._asyncify(c.fetchall) messages: list[ModelMessage] = [] for row in rows: messages.extend(ModelMessagesTypeAdapter.validate_json(row[0])) return messages def _execute( self, sql: LiteralString, *args: Any, commit: bool = False ) -> sqlite3.Cursor: cur = self.con.cursor() cur.execute(sql, args) if commit: self.con.commit() return cur async def _asyncify( self, func: Callable[P, R], *args: P.args, **kwargs: P.kwargs ) -> R: return await self._loop.run_in_executor( # type: ignore self._executor, partial(func, **kwargs), *args, # type: ignore ) if __name__ == '__main__': import uvicorn uvicorn.run( 'pydantic_ai_examples.chat_app:app', reload=True, reload_dirs=[str(THIS_DIR)] ) ``` Simple HTML page to render the app: [chat_app.html](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/chat_app.html) ```html Chat App
<!-- Page title: "Chat App"; prompt input placeholder: "Ask me anything..."; error banner: "Error occurred, check the browser developer console for more information." See chat_app.html in the repo for the full markup. -->
``` TypeScript to handle rendering the messages, to keep this simple (and at the risk of offending frontend developers) the typescript code is passed to the browser as plain text and transpiled in the browser. [chat_app.ts](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/chat_app.ts) ```ts // BIG FAT WARNING: to avoid the complexity of npm, this typescript is compiled in the browser // there's currently no static type checking import { marked } from 'https://cdnjs.cloudflare.com/ajax/libs/marked/15.0.0/lib/marked.esm.js' const convElement = document.getElementById('conversation') const promptInput = document.getElementById('prompt-input') as HTMLInputElement const spinner = document.getElementById('spinner') // stream the response and render messages as each chunk is received // data is sent as newline-delimited JSON async function onFetchResponse(response: Response): Promise { let text = '' let decoder = new TextDecoder() if (response.ok) { const reader = response.body.getReader() while (true) { const {done, value} = await reader.read() if (done) { break } text += decoder.decode(value) addMessages(text) spinner.classList.remove('active') } addMessages(text) promptInput.disabled = false promptInput.focus() } else { const text = await response.text() console.error(`Unexpected response: ${response.status}`, {response, text}) throw new Error(`Unexpected response: ${response.status}`) } } // The format of messages, this matches pydantic-ai both for brevity and understanding // in production, you might not want to keep this format all the way to the frontend interface Message { role: string content: string timestamp: string } // take raw response text and render messages into the `#conversation` element // Message timestamp is assumed to be a unique identifier of a message, and is used to deduplicate // hence you can send data about the same message multiple times, and it will be updated // instead of creating a new message elements function addMessages(responseText: string) { const lines = responseText.split('\n') const messages: Message[] = lines.filter(line => line.length > 1).map(j => JSON.parse(j)) for (const message of messages) { // we use the timestamp as a crude element id const {timestamp, role, content} = message const id = `msg-${timestamp}` let msgDiv = document.getElementById(id) if (!msgDiv) { msgDiv = document.createElement('div') msgDiv.id = id msgDiv.title = `${role} at ${timestamp}` msgDiv.classList.add('border-top', 'pt-2', role) convElement.appendChild(msgDiv) } msgDiv.innerHTML = marked.parse(content) } window.scrollTo({ top: document.body.scrollHeight, behavior: 'smooth' }) } function onError(error: any) { console.error(error) document.getElementById('error').classList.remove('d-none') document.getElementById('spinner').classList.remove('active') } async function onSubmit(e: SubmitEvent): Promise { e.preventDefault() spinner.classList.add('active') const body = new FormData(e.target as HTMLFormElement) promptInput.value = '' promptInput.disabled = true const response = await fetch('/chat/', {method: 'POST', body}) await onFetchResponse(response) } // call onSubmit when the form is submitted (e.g. 
user clicks the send button or hits Enter) document.querySelector('form').addEventListener('submit', (e) => onSubmit(e).catch(onError)) // load messages on page load fetch('/chat/').then(onFetchResponse).catch(onError) ``` # Data Analyst Sometimes in an agent workflow, the agent does not need to know the exact tool output, but still needs to process the tool output in some ways. This is especially common in data analytics: the agent needs to know that the result of a query tool is a `DataFrame` with certain named columns, but not necessarily the content of every single row. With Pydantic AI, you can use a [dependencies object](../../dependencies/) to store the result from one tool and use it in another tool. In this example, we'll build an agent that analyzes the [Rotten Tomatoes movie review dataset from Cornell](https://huggingface.co/datasets/cornell-movie-review-data/rotten_tomatoes). Demonstrates: - [agent dependencies](../../dependencies/) ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.data_analyst ``` ```bash uv run -m pydantic_ai_examples.data_analyst ``` Output (debug): > Based on my analysis of the Cornell Movie Review dataset (rotten_tomatoes), there are **4,265 negative comments** in the training split. These are the reviews labeled as 'neg' (represented by 0 in the dataset). ## Example Code [data_analyst.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/data_analyst.py) ```py from dataclasses import dataclass, field import datasets import duckdb import pandas as pd from pydantic_ai import Agent, ModelRetry, RunContext @dataclass class AnalystAgentDeps: output: dict[str, pd.DataFrame] = field(default_factory=dict) def store(self, value: pd.DataFrame) -> str: """Store the output in deps and return the reference such as Out[1] to be used by the LLM.""" ref = f'Out[{len(self.output) + 1}]' self.output[ref] = value return ref def get(self, ref: str) -> pd.DataFrame: if ref not in self.output: raise ModelRetry( f'Error: {ref} is not a valid variable reference. Check the previous messages and try again.' ) return self.output[ref] analyst_agent = Agent( 'openai:gpt-4o', deps_type=AnalystAgentDeps, instructions='You are a data analyst and your job is to analyze the data according to the user request.', ) @analyst_agent.tool def load_dataset( ctx: RunContext[AnalystAgentDeps], path: str, split: str = 'train', ) -> str: """Load the `split` of dataset `dataset_name` from huggingface. Args: ctx: Pydantic AI agent RunContext path: name of the dataset in the form of `/` split: load the split of the dataset (default: "train") """ # begin load data from hf builder = datasets.load_dataset_builder(path) # pyright: ignore[reportUnknownMemberType] splits: dict[str, datasets.SplitInfo] = builder.info.splits or {} # pyright: ignore[reportUnknownMemberType] if split not in splits: raise ModelRetry( f'{split} is not valid for dataset {path}. 
Valid splits are {",".join(splits.keys())}' ) builder.download_and_prepare() # pyright: ignore[reportUnknownMemberType] dataset = builder.as_dataset(split=split) assert isinstance(dataset, datasets.Dataset) dataframe = dataset.to_pandas() assert isinstance(dataframe, pd.DataFrame) # end load data from hf # store the dataframe in the deps and get a ref like "Out[1]" ref = ctx.deps.store(dataframe) # construct a summary of the loaded dataset output = [ f'Loaded the dataset as `{ref}`.', f'Description: {dataset.info.description}' if dataset.info.description else None, f'Features: {dataset.info.features!r}' if dataset.info.features else None, ] return '\n'.join(filter(None, output)) @analyst_agent.tool def run_duckdb(ctx: RunContext[AnalystAgentDeps], dataset: str, sql: str) -> str: """Run DuckDB SQL query on the DataFrame. Note that the virtual table name used in DuckDB SQL must be `dataset`. Args: ctx: Pydantic AI agent RunContext dataset: reference string to the DataFrame sql: the query to be executed using DuckDB """ data = ctx.deps.get(dataset) result = duckdb.query_df(df=data, virtual_table_name='dataset', sql_query=sql) # pass the result as ref (because DuckDB SQL can select many rows, creating another huge dataframe) ref = ctx.deps.store(result.df()) # pyright: ignore[reportUnknownMemberType] return f'Executed SQL, result is `{ref}`' @analyst_agent.tool def display(ctx: RunContext[AnalystAgentDeps], name: str) -> str: """Display at most 5 rows of the dataframe.""" dataset = ctx.deps.get(name) return dataset.head().to_string() # pyright: ignore[reportUnknownMemberType] if __name__ == '__main__': deps = AnalystAgentDeps() result = analyst_agent.run_sync( user_prompt='Count how many negative comments are there in the dataset `cornell-movie-review-data/rotten_tomatoes`', deps=deps, ) print(result.output) ``` ## Appendix ### Choosing a Model This example requires using a model that understands DuckDB SQL. You can check with `clai`: ```sh > clai -m bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0 clai - Pydantic AI CLI v0.0.1.dev920+41dd069 with bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0 clai ➤ do you understand duckdb sql? # DuckDB SQL Yes, I understand DuckDB SQL. DuckDB is an in-process analytical SQL database that uses syntax similar to PostgreSQL. It specializes in analytical queries and is designed for high-performance analysis of structured data. Some key features of DuckDB SQL include: • OLAP (Online Analytical Processing) optimized • Columnar-vectorized query execution • Standard SQL support with PostgreSQL compatibility • Support for complex analytical queries • Efficient handling of CSV/Parquet/JSON files I can help you with DuckDB SQL queries, schema design, optimization, or other DuckDB-related questions. ``` Example of a multi-agent flow where one agent delegates work to another, then hands off control to a third agent. Demonstrates: - [agent delegation](../../multi-agent-applications/#agent-delegation) - [programmatic agent hand-off](../../multi-agent-applications/#programmatic-agent-hand-off) - [usage limits](../../agents/#usage-limits) In this scenario, a group of agents work together to find the best flight for a user. 
The control flow for this example can be summarised as follows: ``` graph TD START --> search_agent("search agent") search_agent --> extraction_agent("extraction agent") extraction_agent --> search_agent search_agent --> human_confirm("human confirm") human_confirm --> search_agent search_agent --> FAILED human_confirm --> find_seat_function("find seat function") find_seat_function --> human_seat_choice("human seat choice") human_seat_choice --> find_seat_agent("find seat agent") find_seat_agent --> find_seat_function find_seat_function --> buy_flights("buy flights") buy_flights --> SUCCESS ``` ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.flight_booking ``` ```bash uv run -m pydantic_ai_examples.flight_booking ``` ## Example Code [flight_booking.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/flight_booking.py) ```py """Example of a multi-agent flow where one agent delegates work to another. In this scenario, a group of agents work together to find flights for a user. """ import datetime from dataclasses import dataclass from typing import Literal import logfire from pydantic import BaseModel, Field from rich.prompt import Prompt from pydantic_ai import Agent, ModelRetry, RunContext, RunUsage, UsageLimits from pydantic_ai.messages import ModelMessage # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class FlightDetails(BaseModel): """Details of the most suitable flight.""" flight_number: str price: int origin: str = Field(description='Three-letter airport code') destination: str = Field(description='Three-letter airport code') date: datetime.date class NoFlightFound(BaseModel): """When no valid flight is found.""" @dataclass class Deps: web_page_text: str req_origin: str req_destination: str req_date: datetime.date # This agent is responsible for controlling the flow of the conversation. search_agent = Agent[Deps, FlightDetails | NoFlightFound]( 'openai:gpt-4o', output_type=FlightDetails | NoFlightFound, # type: ignore retries=4, system_prompt=( 'Your job is to find the cheapest flight for the user on the given date. ' ), ) # This agent is responsible for extracting flight details from web page text. 
extraction_agent = Agent( 'openai:gpt-4o', output_type=list[FlightDetails], system_prompt='Extract all the flight details from the given text.', ) @search_agent.tool async def extract_flights(ctx: RunContext[Deps]) -> list[FlightDetails]: """Get details of all flights.""" # we pass the usage to the search agent so requests within this agent are counted result = await extraction_agent.run(ctx.deps.web_page_text, usage=ctx.usage) logfire.info('found {flight_count} flights', flight_count=len(result.output)) return result.output @search_agent.output_validator async def validate_output( ctx: RunContext[Deps], output: FlightDetails | NoFlightFound ) -> FlightDetails | NoFlightFound: """Procedural validation that the flight meets the constraints.""" if isinstance(output, NoFlightFound): return output errors: list[str] = [] if output.origin != ctx.deps.req_origin: errors.append( f'Flight should have origin {ctx.deps.req_origin}, not {output.origin}' ) if output.destination != ctx.deps.req_destination: errors.append( f'Flight should have destination {ctx.deps.req_destination}, not {output.destination}' ) if output.date != ctx.deps.req_date: errors.append(f'Flight should be on {ctx.deps.req_date}, not {output.date}') if errors: raise ModelRetry('\n'.join(errors)) else: return output class SeatPreference(BaseModel): row: int = Field(ge=1, le=30) seat: Literal['A', 'B', 'C', 'D', 'E', 'F'] class Failed(BaseModel): """Unable to extract a seat selection.""" # This agent is responsible for extracting the user's seat selection seat_preference_agent = Agent[None, SeatPreference | Failed]( 'openai:gpt-4o', output_type=SeatPreference | Failed, system_prompt=( "Extract the user's seat preference. " 'Seats A and F are window seats. ' 'Row 1 is the front row and has extra leg room. ' 'Rows 14, and 20 also have extra leg room. ' ), ) # in reality this would be downloaded from a booking site, # potentially using another agent to navigate the site flights_web_page = """ 1. Flight SFO-AK123 - Price: $350 - Origin: San Francisco International Airport (SFO) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 10, 2025 2. Flight SFO-AK456 - Price: $370 - Origin: San Francisco International Airport (SFO) - Destination: Fairbanks International Airport (FAI) - Date: January 10, 2025 3. Flight SFO-AK789 - Price: $400 - Origin: San Francisco International Airport (SFO) - Destination: Juneau International Airport (JNU) - Date: January 20, 2025 4. Flight NYC-LA101 - Price: $250 - Origin: San Francisco International Airport (SFO) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 10, 2025 5. Flight CHI-MIA202 - Price: $200 - Origin: Chicago O'Hare International Airport (ORD) - Destination: Miami International Airport (MIA) - Date: January 12, 2025 6. Flight BOS-SEA303 - Price: $120 - Origin: Boston Logan International Airport (BOS) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 12, 2025 7. Flight DFW-DEN404 - Price: $150 - Origin: Dallas/Fort Worth International Airport (DFW) - Destination: Denver International Airport (DEN) - Date: January 10, 2025 8. 
Flight ATL-HOU505 - Price: $180 - Origin: Hartsfield-Jackson Atlanta International Airport (ATL) - Destination: George Bush Intercontinental Airport (IAH) - Date: January 10, 2025 """ # restrict how many requests this app can make to the LLM usage_limits = UsageLimits(request_limit=15) async def main(): deps = Deps( web_page_text=flights_web_page, req_origin='SFO', req_destination='ANC', req_date=datetime.date(2025, 1, 10), ) message_history: list[ModelMessage] | None = None usage: RunUsage = RunUsage() # run the agent until a satisfactory flight is found while True: result = await search_agent.run( f'Find me a flight from {deps.req_origin} to {deps.req_destination} on {deps.req_date}', deps=deps, usage=usage, message_history=message_history, usage_limits=usage_limits, ) if isinstance(result.output, NoFlightFound): print('No flight found') break else: flight = result.output print(f'Flight found: {flight}') answer = Prompt.ask( 'Do you want to buy this flight, or keep searching? (buy/*search)', choices=['buy', 'search', ''], show_choices=False, ) if answer == 'buy': seat = await find_seat(usage) await buy_tickets(flight, seat) break else: message_history = result.all_messages( output_tool_return_content='Please suggest another flight' ) async def find_seat(usage: RunUsage) -> SeatPreference: message_history: list[ModelMessage] | None = None while True: answer = Prompt.ask('What seat would you like?') result = await seat_preference_agent.run( answer, message_history=message_history, usage=usage, usage_limits=usage_limits, ) if isinstance(result.output, SeatPreference): return result.output else: print('Could not understand seat preference. Please try again.') message_history = result.all_messages() async def buy_tickets(flight_details: FlightDetails, seat: SeatPreference): print(f'Purchasing flight {flight_details=!r} {seat=!r}...') if __name__ == '__main__': import asyncio asyncio.run(main()) ``` # Pydantic Model Simple example of using Pydantic AI to construct a Pydantic model from a text input. Demonstrates: - [structured `output_type`](../../output/#structured-output) ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.pydantic_model ``` ```bash uv run -m pydantic_ai_examples.pydantic_model ``` This examples uses `openai:gpt-4o` by default, but it works well with other models, e.g. you can run it with Gemini using: ```bash PYDANTIC_AI_MODEL=gemini-1.5-pro python -m pydantic_ai_examples.pydantic_model ``` ```bash PYDANTIC_AI_MODEL=gemini-1.5-pro uv run -m pydantic_ai_examples.pydantic_model ``` (or `PYDANTIC_AI_MODEL=gemini-1.5-flash ...`) ## Example Code [pydantic_model.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/pydantic_model.py) ```py """Simple example of using Pydantic AI to construct a Pydantic model from a text input. 
Run with: uv run -m pydantic_ai_examples.pydantic_model """ import os import logfire from pydantic import BaseModel from pydantic_ai import Agent # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class MyModel(BaseModel): city: str country: str model = os.getenv('PYDANTIC_AI_MODEL', 'openai:gpt-4o') print(f'Using model: {model}') agent = Agent(model, output_type=MyModel) if __name__ == '__main__': result = agent.run_sync('The windy city in the US of A.') print(result.output) print(result.usage()) ``` # Question Graph Example of a graph for asking and evaluating questions. Demonstrates: - [`pydantic_graph`](../../graph/) ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.question_graph ``` ```bash uv run -m pydantic_ai_examples.question_graph ``` ## Example Code [question_graph.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/question_graph.py) ```py """Example of a graph for asking and evaluating questions. Run with: uv run -m pydantic_ai_examples.question_graph """ from __future__ import annotations as _annotations from dataclasses import dataclass, field from pathlib import Path import logfire from groq import BaseModel from pydantic_ai import Agent, format_as_xml from pydantic_ai.messages import ModelMessage from pydantic_graph import ( BaseNode, End, Graph, GraphRunContext, ) from pydantic_graph.persistence.file import FileStatePersistence # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() ask_agent = Agent('openai:gpt-4o', output_type=str) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list) evaluate_agent_messages: list[ModelMessage] = field(default_factory=list) @dataclass class Ask(BaseNode[QuestionState]): async def run(self, ctx: GraphRunContext[QuestionState]) -> Answer: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.all_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationOutput(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'openai:gpt-4o', output_type=EvaluationOutput, system_prompt='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> End[str] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.all_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass 
class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) async def run_as_continuous(): state = QuestionState() node = Ask() end = await question_graph.run(node, state=state) print('END:', end.output) async def run_as_cli(answer: str | None): persistence = FileStatePersistence(Path('question_graph.json')) persistence.set_graph_types(question_graph) if snapshot := await persistence.load_next(): state = snapshot.state assert answer is not None, ( 'answer required, usage "uv run -m pydantic_ai_examples.question_graph cli "' ) node = Evaluate(answer) else: state = QuestionState() node = Ask() # debug(state, node) async with question_graph.iter(node, state=state, persistence=persistence) as run: while True: node = await run.next() if isinstance(node, End): print('END:', node.data) history = await persistence.load_all() print('history:', '\n'.join(str(e.node) for e in history), sep='\n') print('Finished!') break elif isinstance(node, Answer): print(node.question) break # otherwise just continue if __name__ == '__main__': import asyncio import sys try: sub_command = sys.argv[1] assert sub_command in ('continuous', 'cli', 'mermaid') except (IndexError, AssertionError): print( 'Usage:\n' ' uv run -m pydantic_ai_examples.question_graph mermaid\n' 'or:\n' ' uv run -m pydantic_ai_examples.question_graph continuous\n' 'or:\n' ' uv run -m pydantic_ai_examples.question_graph cli [answer]', file=sys.stderr, ) sys.exit(1) if sub_command == 'mermaid': print(question_graph.mermaid_code(start_node=Ask)) elif sub_command == 'continuous': asyncio.run(run_as_continuous()) else: a = sys.argv[2] if len(sys.argv) > 2 else None asyncio.run(run_as_cli(a)) ``` The mermaid diagram generated in this example looks like this: ``` --- title: question_graph --- stateDiagram-v2 [*] --> Ask Ask --> Answer: ask the question Answer --> Evaluate: answer the question Evaluate --> Congratulate Evaluate --> Castigate Congratulate --> [*]: success Castigate --> Ask: try again ``` # RAG RAG search example. This demo allows you to ask question of the [logfire](https://pydantic.dev/logfire) documentation. Demonstrates: - [tools](../../tools/) - [agent dependencies](../../dependencies/) - RAG search This is done by creating a database containing each section of the markdown documentation, then registering the search tool with the Pydantic AI agent. Logic for extracting sections from markdown files and a JSON file with that data is available in [this gist](https://gist.github.com/samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992). [PostgreSQL with pgvector](https://github.com/pgvector/pgvector) is used as the search database, the easiest way to download and run pgvector is using Docker: ```bash mkdir postgres-data docker run --rm \ -e POSTGRES_PASSWORD=postgres \ -p 54320:5432 \ -v `pwd`/postgres-data:/var/lib/postgresql/data \ pgvector/pgvector:pg17 ``` As with the [SQL gen](../sql-gen/) example, we run postgres on port `54320` to avoid conflicts with any other postgres instances you may have running. We also mount the PostgreSQL `data` directory locally to persist the data if you need to stop and restart the container. 
With that running and [dependencies installed and environment variables set](../setup/#usage), we can build the search database with (**WARNING**: this requires the `OPENAI_API_KEY` env variable and will calling the OpenAI embedding API around 300 times to generate embeddings for each section of the documentation): ```bash python -m pydantic_ai_examples.rag build ``` ```bash uv run -m pydantic_ai_examples.rag build ``` (Note building the database doesn't use Pydantic AI right now, instead it uses the OpenAI SDK directly.) You can then ask the agent a question with: ```bash python -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?" ``` ```bash uv run -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?" ``` ## Example Code [rag.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/rag.py) ```py """RAG example with pydantic-ai — using vector search to augment a chat agent. Run pgvector with: mkdir postgres-data docker run --rm -e POSTGRES_PASSWORD=postgres \ -p 54320:5432 \ -v `pwd`/postgres-data:/var/lib/postgresql/data \ pgvector/pgvector:pg17 Build the search DB with: uv run -m pydantic_ai_examples.rag build Ask the agent a question with: uv run -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?" """ from __future__ import annotations as _annotations import asyncio import re import sys import unicodedata from contextlib import asynccontextmanager from dataclasses import dataclass import asyncpg import httpx import logfire import pydantic_core from openai import AsyncOpenAI from pydantic import TypeAdapter from typing_extensions import AsyncGenerator from pydantic_ai import Agent, RunContext # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_asyncpg() logfire.instrument_pydantic_ai() @dataclass class Deps: openai: AsyncOpenAI pool: asyncpg.Pool agent = Agent('openai:gpt-4o', deps_type=Deps) @agent.tool async def retrieve(context: RunContext[Deps], search_query: str) -> str: """Retrieve documentation sections based on a search query. Args: context: The call context. search_query: The search query. """ with logfire.span( 'create embedding for {search_query=}', search_query=search_query ): embedding = await context.deps.openai.embeddings.create( input=search_query, model='text-embedding-3-small', ) assert len(embedding.data) == 1, ( f'Expected 1 embedding, got {len(embedding.data)}, doc query: {search_query!r}' ) embedding = embedding.data[0].embedding embedding_json = pydantic_core.to_json(embedding).decode() rows = await context.deps.pool.fetch( 'SELECT url, title, content FROM doc_sections ORDER BY embedding <-> $1 LIMIT 8', embedding_json, ) return '\n\n'.join( f'# {row["title"]}\nDocumentation URL:{row["url"]}\n\n{row["content"]}\n' for row in rows ) async def run_agent(question: str): """Entry point to run the agent and perform RAG based question answering.""" openai = AsyncOpenAI() logfire.instrument_openai(openai) logfire.info('Asking "{question}"', question=question) async with database_connect(False) as pool: deps = Deps(openai=openai, pool=pool) answer = await agent.run(question, deps=deps) print(answer.output) ####################################################### # The rest of this file is dedicated to preparing the # # search database, and some utilities. 
# ####################################################### # JSON document from # https://gist.github.com/samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992 DOCS_JSON = ( 'https://gist.githubusercontent.com/' 'samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992/raw/' '80c5925c42f1442c24963aaf5eb1a324d47afe95/logfire_docs.json' ) async def build_search_db(): """Build the search database.""" async with httpx.AsyncClient() as client: response = await client.get(DOCS_JSON) response.raise_for_status() sections = sessions_ta.validate_json(response.content) openai = AsyncOpenAI() logfire.instrument_openai(openai) async with database_connect(True) as pool: with logfire.span('create schema'): async with pool.acquire() as conn: async with conn.transaction(): await conn.execute(DB_SCHEMA) sem = asyncio.Semaphore(10) async with asyncio.TaskGroup() as tg: for section in sections: tg.create_task(insert_doc_section(sem, openai, pool, section)) async def insert_doc_section( sem: asyncio.Semaphore, openai: AsyncOpenAI, pool: asyncpg.Pool, section: DocsSection, ) -> None: async with sem: url = section.url() exists = await pool.fetchval('SELECT 1 FROM doc_sections WHERE url = $1', url) if exists: logfire.info('Skipping {url=}', url=url) return with logfire.span('create embedding for {url=}', url=url): embedding = await openai.embeddings.create( input=section.embedding_content(), model='text-embedding-3-small', ) assert len(embedding.data) == 1, ( f'Expected 1 embedding, got {len(embedding.data)}, doc section: {section}' ) embedding = embedding.data[0].embedding embedding_json = pydantic_core.to_json(embedding).decode() await pool.execute( 'INSERT INTO doc_sections (url, title, content, embedding) VALUES ($1, $2, $3, $4)', url, section.title, section.content, embedding_json, ) @dataclass class DocsSection: id: int parent: int | None path: str level: int title: str content: str def url(self) -> str: url_path = re.sub(r'\.md$', '', self.path) return ( f'https://logfire.pydantic.dev/docs/{url_path}/#{slugify(self.title, "-")}' ) def embedding_content(self) -> str: return '\n\n'.join((f'path: {self.path}', f'title: {self.title}', self.content)) sessions_ta = TypeAdapter(list[DocsSection]) # pyright: reportUnknownMemberType=false # pyright: reportUnknownVariableType=false @asynccontextmanager async def database_connect( create_db: bool = False, ) -> AsyncGenerator[asyncpg.Pool, None]: server_dsn, database = ( 'postgresql://postgres:postgres@localhost:54320', 'pydantic_ai_rag', ) if create_db: with logfire.span('check and create DB'): conn = await asyncpg.connect(server_dsn) try: db_exists = await conn.fetchval( 'SELECT 1 FROM pg_database WHERE datname = $1', database ) if not db_exists: await conn.execute(f'CREATE DATABASE {database}') finally: await conn.close() pool = await asyncpg.create_pool(f'{server_dsn}/{database}') try: yield pool finally: await pool.close() DB_SCHEMA = """ CREATE EXTENSION IF NOT EXISTS vector; CREATE TABLE IF NOT EXISTS doc_sections ( id serial PRIMARY KEY, url text NOT NULL UNIQUE, title text NOT NULL, content text NOT NULL, -- text-embedding-3-small returns a vector of 1536 floats embedding vector(1536) NOT NULL ); CREATE INDEX IF NOT EXISTS idx_doc_sections_embedding ON doc_sections USING hnsw (embedding vector_l2_ops); """ def slugify(value: str, separator: str, unicode: bool = False) -> str: """Slugify a string, to make it URL friendly.""" # Taken unchanged from https://github.com/Python-Markdown/markdown/blob/3.7/markdown/extensions/toc.py#L38 if not unicode: # Replace Extended Latin 
characters with ASCII, i.e. `žlutý` => `zluty` value = unicodedata.normalize('NFKD', value) value = value.encode('ascii', 'ignore').decode('ascii') value = re.sub(r'[^\w\s-]', '', value).strip().lower() return re.sub(rf'[{separator}\s]+', separator, value) if __name__ == '__main__': action = sys.argv[1] if len(sys.argv) > 1 else None if action == 'build': asyncio.run(build_search_db()) elif action == 'search': if len(sys.argv) == 3: q = sys.argv[2] else: q = 'How do I configure logfire to work with FastAPI?' asyncio.run(run_agent(q)) else: print( 'uv run --extra examples -m pydantic_ai_examples.rag build|search', file=sys.stderr, ) sys.exit(1) ``` # Examples Here we include some examples of how to use Pydantic AI and what it can do. ## Usage These examples are distributed with `pydantic-ai` so you can run them either by cloning the [pydantic-ai repo](https://github.com/pydantic/pydantic-ai) or by simply installing `pydantic-ai` from PyPI with `pip` or `uv`. ### Installing required dependencies Either way you'll need to install extra dependencies to run some examples, you just need to install the `examples` optional dependency group. If you've installed `pydantic-ai` via pip/uv, you can install the extra dependencies with: ```bash pip install "pydantic-ai[examples]" ``` ```bash uv add "pydantic-ai[examples]" ``` If you clone the repo, you should instead use `uv sync --extra examples` to install extra dependencies. ### Setting model environment variables These examples will need you to set up authentication with one or more of the LLMs, see the [model configuration](../../models/overview/) docs for details on how to do this. TL;DR: in most cases you'll need to set one of the following environment variables: ```bash export OPENAI_API_KEY=your-api-key ``` ```bash export GEMINI_API_KEY=your-api-key ``` ### Running Examples To run the examples (this will work whether you installed `pydantic_ai`, or cloned the repo), run: ```bash python -m pydantic_ai_examples. ``` ```bash uv run -m pydantic_ai_examples. ``` For examples, to run the very simple [`pydantic_model`](../pydantic-model/) example: ```bash python -m pydantic_ai_examples.pydantic_model ``` ```bash uv run -m pydantic_ai_examples.pydantic_model ``` If you like one-liners and you're using uv, you can run a pydantic-ai example with zero setup: ```bash OPENAI_API_KEY='your-api-key' \ uv run --with "pydantic-ai[examples]" \ -m pydantic_ai_examples.pydantic_model ``` ______________________________________________________________________ You'll probably want to edit examples in addition to just running them. You can copy the examples to a new directory with: ```bash python -m pydantic_ai_examples --copy-to examples/ ``` ```bash uv run -m pydantic_ai_examples --copy-to examples/ ``` # Slack Lead Qualifier with Modal In this example, we're going to build an agentic app that: - automatically researches each new member that joins a company's public Slack community to see how good of a fit they are for the company's commercial product, - sends this analysis into a (private) Slack channel, and - sends a daily summary of the top 5 leads from the previous 24 hours into a (different) Slack channel. We'll be deploying the app on [Modal](https://modal.com), as it lets you use Python to define an app with web endpoints, scheduled functions, and background functions, and deploy them with a CLI, without needing to set up or manage any infrastructure. 
It's a great way to lower the barrier for people in your organization to start building and deploying AI agents to make their jobs easier. We also add [Pydantic Logfire](https://pydantic.dev/logfire) to get observability into the app and agent as they're running in response to webhooks and the schedule ## Screenshots This is what the analysis sent into Slack will look like: This is what the corresponding trace in [Logfire](https://pydantic.dev/logfire) will look like: All of these entries can be clicked on to get more details about what happened at that step, including the full conversation with the LLM and HTTP requests and responses. ## Prerequisites If you just want to see the code without actually going through the effort of setting up the bits necessary to run it, feel free to [jump ahead](#the-code). ### Slack app You need to have a Slack workspace and the necessary permissions to create apps. 2. Create a new Slack app using the instructions at . 1. In step 2, "Requesting scopes", request the following scopes: - [`users.read`](https://docs.slack.dev/reference/scopes/users.read) - [`users.read.email`](https://docs.slack.dev/reference/scopes/users.read.email) - [`users.profile.read`](https://docs.slack.dev/reference/scopes/users.profile.read) 1. In step 3, "Installing and authorizing the app", note down the Access Token as we're going to need to store it as a Secret in Modal. 1. You can skip steps 4 and 5. We're going to need to subscribe to the `team_join` event, but at this point you don't have a webhook URL yet. 1. Create the channels the app will post into, and add the Slack app to them: - `#new-slack-leads` - `#daily-slack-leads-summary` These names are hard-coded in the example. If you want to use different channels, you can clone the repo and change them in `examples/pydantic_examples/slack_lead_qualifier/functions.py`. ### Logfire Write Token 1. If you don't have a Logfire account yet, create one on . 1. Create a new project named, for example, `slack-lead-qualifier`. 1. Generate a new Write Token and note it down, as we're going to need to store it as a Secret in Modal. ### OpenAI API Key 1. If you don't have an OpenAI account yet, create one on . 1. Create a new API Key in Settings and note it down, as we're going to need to store it as a Secret in Modal. ### Modal account 1. If you don't have a Modal account yet, create one on . 1. Create 3 Secrets of type "Custom" on : - Name: `slack`, key: `SLACK_API_KEY`, value: the Slack Access Token you generated earlier - Name: `logfire`, key: `LOGFIRE_TOKEN`, value: the Logfire Write Token you generated earlier - Name: `openai`, key: `OPENAI_API_KEY`, value: the OpenAI API Key you generated earlier ## Usage 1. Make sure you have the [dependencies installed](../setup/#usage). 1. Authenticate with Modal: ```bash python/uv-run -m modal setup ``` 1. Run the example as an [ephemeral Modal app](https://modal.com/docs/guide/apps#ephemeral-apps), meaning it will only run until you quit it using Ctrl+C: ```bash python/uv-run -m modal serve -m pydantic_ai_examples.slack_lead_qualifier.modal ``` 1. Note down the URL after `Created web function web_app =>`, this is your webhook endpoint URL. 1. Go back to and follow step 4, "Configuring the app for event listening", to subscribe to the `team_join` event with the webhook endpoint URL you noted down as the Request URL. 
Now when someone new (possibly you with a throwaway email) joins the Slack workspace, you'll see the webhook event being processed in the terminal where you ran `modal serve` and in the Logfire Live view, and after waiting a few seconds you should see the result appear in the `#new-slack-leads` Slack channel! Faking a Slack signup You can also fake a Slack signup event and try out the agent like this, with any name or email you please: ```bash curl -X POST \ -H "Content-Type: application/json" \ -d '{ "type": "event_callback", "event": { "type": "team_join", "user": { "profile": { "email": "samuel@pydantic.dev", "first_name": "Samuel", "last_name": "Colvin", "display_name": "Samuel Colvin" } } } }' ``` Deploying to production If you'd like to deploy this app into your Modal workspace in a persistent fashion, you can use this command: ```bash python/uv-run -m modal deploy -m pydantic_ai_examples.slack_lead_qualifier.modal ``` You'll likely want to [download the code](https://github.com/pydantic/pydantic-ai/tree/main/examples/pydantic_ai_examples/slack_lead_qualifier) first, put it in a new repo, and then do [continuous deployment](https://modal.com/docs/guide/continuous-deployment#github-actions) using GitHub Actions. Don't forget to update the Slack event request URL to the new persistent URL! You'll also want to modify the [instructions for the agent](#agent) to your own situation. ## The code We're going to start with the basics, and then gradually build up into the full app. ### Models #### `Profile` First, we define a [Pydantic](https://docs.pydantic.dev) model that represents a Slack user profile. These are the fields we get from the [`team_join`](https://docs.slack.dev/reference/events/team_join) event that's sent to the webhook endpoint that we'll define in a bit. [slack_lead_qualifier/models.py (L11-L15)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/models.py#L11-L15) ```py ... class Profile(BaseModel): first_name: str | None = None last_name: str | None = None display_name: str | None = None email: str ... ``` We also define a `Profile.as_prompt()` helper method that uses format_as_xml to turn the profile into a string that can be sent to the model. [slack_lead_qualifier/models.py (L7-L19)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/models.py#L7-L19) ```py ... from pydantic_ai import format_as_xml ... class Profile(BaseModel): ... def as_prompt(self) -> str: return format_as_xml(self, root_tag='profile') ... ``` #### `Analysis` The second model we'll need represents the result of the analysis that the agent will perform. We include docstrings to provide additional context to the model on what these fields should contain. [slack_lead_qualifier/models.py (L23-L31)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/models.py#L23-L31) ```py ... class Analysis(BaseModel): profile: Profile organization_name: str organization_domain: str job_title: str relevance: Annotated[int, Ge(1), Le(5)] """Estimated fit for Pydantic Logfire: 1 = low, 5 = high""" summary: str """One-sentence welcome note summarising who they are and how we might help""" ... ``` We also define a `Analysis.as_slack_blocks()` helper method that turns the analysis into some [Slack blocks](https://api.slack.com/reference/block-kit/blocks) that can be sent to the Slack API to post a new message. 
[slack_lead_qualifier/models.py (L23-L46)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/models.py#L23-L46) ```py ... class Analysis(BaseModel): ... def as_slack_blocks(self, include_relevance: bool = False) -> list[dict[str, Any]]: profile = self.profile relevance = f'({self.relevance}/5)' if include_relevance else '' return [ { 'type': 'markdown', 'text': f'[{profile.display_name}](mailto:{profile.email}), {self.job_title} at [**{self.organization_name}**](https://{self.organization_domain}) {relevance}', }, { 'type': 'markdown', 'text': self.summary, }, ] ``` ### Agent Now it's time to get into Pydantic AI and define the agent that will do the actual analysis! We specify the model we'll use (`openai:gpt-4o`), provide [instructions](../../agents/#instructions), give the agent access to the [DuckDuckGo search tool](../../common-tools/#duckduckgo-search-tool), and tell it to output either an `Analysis` or `None` using the [Native Output](../../output/#native-output) structured output mode. The real meat of the app is in the instructions that tell the agent how to evaluate each new Slack member. If you plan to use this app yourself, you'll of course want to modify them to your own situation. [slack_lead_qualifier/agent.py (L7-L40)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/agent.py#L7-L40) ```py ... from pydantic_ai import Agent, NativeOutput from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool ... agent = Agent( 'openai:gpt-4o', instructions=dedent( """ When a new person joins our public Slack, please put together a brief snapshot so we can be most useful to them. **What to include** 1. **Who they are:** Any details about their professional role or projects (e.g. LinkedIn, GitHub, company bio). 2. **Where they work:** Name of the organisation and its domain. 3. **How we can help:** On a scale of 1–5, estimate how likely they are to benefit from **Pydantic Logfire** (our paid observability tool) based on factors such as company size, product maturity, or AI usage. *1 = probably not relevant, 5 = very strong fit.* **Our products (for context only)** • **Pydantic Validation** – Python data-validation (open source) • **Pydantic AI** – Python agent framework (open source) • **Pydantic Logfire** – Observability for traces, logs & metrics with first-class AI support (commercial) **How to research** • Use the provided DuckDuckGo search tool to research the person and the organization they work for, based on the email domain or what you find on e.g. LinkedIn and GitHub. • If you can't find enough to form a reasonable view, return **None**. """ ), tools=[duckduckgo_search_tool()], output_type=NativeOutput([Analysis, NoneType]), ) ... ``` #### `analyze_profile` We also define a `analyze_profile` helper function that takes a `Profile`, runs the agent, and returns an `Analysis` (or `None`), and instrument it using [Logfire](../../logfire/). [slack_lead_qualifier/agent.py (L44-L47)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/agent.py#L44-L47) ```py ... @logfire.instrument('Analyze profile') async def analyze_profile(profile: Profile) -> Analysis | None: result = await agent.run(profile.as_prompt()) return result.output ``` ### Analysis store The next building block we'll need is a place to store all the analyses that have been done so that we can look them up when we send the daily summary. 
Fortunately, Modal provides us with a convenient way to store some data that can be read back in a subsequent Modal run (webhook or scheduled): [`modal.Dict`](https://modal.com/docs/reference/modal.Dict). We define some convenience methods to easily add, list, and clear analyses. [slack_lead_qualifier/store.py (L4-L31)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/store.py#L4-L31) ```py ... import modal ... class AnalysisStore: @classmethod @logfire.instrument('Add analysis to store') async def add(cls, analysis: Analysis): await cls._get_store().put.aio(analysis.profile.email, analysis.model_dump()) @classmethod @logfire.instrument('List analyses from store') async def list(cls) -> list[Analysis]: return [ Analysis.model_validate(analysis) async for analysis in cls._get_store().values.aio() ] @classmethod @logfire.instrument('Clear analyses from store') async def clear(cls): await cls._get_store().clear.aio() @classmethod def _get_store(cls) -> modal.Dict: return modal.Dict.from_name('analyses', create_if_missing=True) # type: ignore ``` Note Note that `# type: ignore` on the last line -- unfortunately `modal` does not fully define its types, so we need this to stop our static type checker `pyright`, which we run over all Pydantic AI code including examples, from complaining. ### Send Slack message Next, we'll need a way to actually send a Slack message, so we define a simple function that uses Slack's [`chat.postMessage`](https://api.slack.com/methods/chat.postMessage) API. [slack_lead_qualifier/slack.py (L8-L30)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/slack.py#L8-L30) ```py ... API_KEY = os.getenv('SLACK_API_KEY') assert API_KEY, 'SLACK_API_KEY is not set' @logfire.instrument('Send Slack message') async def send_slack_message(channel: str, blocks: list[dict[str, Any]]): client = httpx.AsyncClient() response = await client.post( 'https://slack.com/api/chat.postMessage', json={ 'channel': channel, 'blocks': blocks, }, headers={ 'Authorization': f'Bearer {API_KEY}', }, timeout=5, ) response.raise_for_status() result = response.json() if not result.get('ok', False): error = result.get('error', 'Unknown error') raise Exception(f'Failed to send to Slack: {error}') ``` ### Features Now we can start putting these building blocks together to implement the actual features we want! #### `process_slack_member` This function takes a [`Profile`](#profile), [analyzes](#analyze_profile) it using the agent, adds it to the [`AnalysisStore`](#analysis-store), and [sends](#send-slack-message) the analysis into the `#new-slack-leads` channel. [slack_lead_qualifier/functions.py (L4-L45)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/functions.py#L4-L45) ```py ... from .agent import analyze_profile from .models import Profile from .slack import send_slack_message from .store import AnalysisStore ... NEW_LEAD_CHANNEL = '#new-slack-leads' ... @logfire.instrument('Process Slack member') async def process_slack_member(profile: Profile): analysis = await analyze_profile(profile) logfire.info('Analysis', analysis=analysis) if analysis is None: return await AnalysisStore().add(analysis) await send_slack_message( NEW_LEAD_CHANNEL, [ { 'type': 'header', 'text': { 'type': 'plain_text', 'text': f'New Slack member with score {analysis.relevance}/5', }, }, { 'type': 'divider', }, *analysis.as_slack_blocks(), ], ) ... 
``` #### `send_daily_summary` This function lists all of the analyses in the [`AnalysisStore`](#analysis-store), takes the top 5 by relevance, [sends](#send-slack-message) them into the `#daily-slack-leads-summary` channel, and clears the `AnalysisStore` so that the next daily run won't process these analyses again. [slack_lead_qualifier/functions.py (L8-L85)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/functions.py#L8-L85) ```py ... from .slack import send_slack_message from .store import AnalysisStore ... DAILY_SUMMARY_CHANNEL = '#daily-slack-leads-summary' ... @logfire.instrument('Send daily summary') async def send_daily_summary(): analyses = await AnalysisStore().list() logfire.info('Analyses', analyses=analyses) if len(analyses) == 0: return sorted_analyses = sorted(analyses, key=lambda x: x.relevance, reverse=True) top_analyses = sorted_analyses[:5] blocks = [ { 'type': 'header', 'text': { 'type': 'plain_text', 'text': f'Top {len(top_analyses)} new Slack members from the last 24 hours', }, }, ] for analysis in top_analyses: blocks.extend( [ { 'type': 'divider', }, *analysis.as_slack_blocks(include_relevance=True), ] ) await send_slack_message( DAILY_SUMMARY_CHANNEL, blocks, ) await AnalysisStore().clear() ``` ### Web app As it stands, neither of these functions is actually being called from anywhere. Let's implement a [FastAPI](https://fastapi.tiangolo.com/) endpoint to handle the `team_join` Slack webhook (also known as the [Slack Events API](https://docs.slack.dev/apis/events-api)) and call the [`process_slack_member`](#process_slack_member) function we just defined. We also instrument FastAPI using Logfire for good measure. [slack_lead_qualifier/app.py (L20-L36)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/app.py#L20-L36) ```py ... app = FastAPI() logfire.instrument_fastapi(app, capture_headers=True) @app.post('/') async def process_webhook(payload: dict[str, Any]) -> dict[str, Any]: if payload['type'] == 'url_verification': return {'challenge': payload['challenge']} elif ( payload['type'] == 'event_callback' and payload['event']['type'] == 'team_join' ): profile = Profile.model_validate(payload['event']['user']['profile']) process_slack_member(profile) return {'status': 'OK'} raise HTTPException(status_code=status.HTTP_422_UNPROCESSABLE_ENTITY) ``` #### `process_slack_member` with Modal I was a little sneaky there -- we're not actually calling the [`process_slack_member`](#process_slack_member) function we defined in `functions.py` directly, as Slack requires webhooks to respond within 3 seconds, and we need a bit more time than that to talk to the LLM, do some web searches, and send the Slack message. Instead, we're calling the following function defined alongside the app, which uses Modal's [`modal.Function.spawn`](https://modal.com/docs/reference/modal.Function#spawn) feature to run a function in the background. (If you're curious what the Modal side of this function looks like, you can [jump ahead](#backgrounded-process_slack_member).) Because `modal.py` (which we'll see in the next section) imports `app.py`, importing from `modal.py` at the top level of `app.py` would cause a circular import error, so we do the import inside the function definition instead.
We also pass along the current Logfire context to get [Distributed Tracing](https://logfire.pydantic.dev/docs/how-to-guides/distributed-tracing/), meaning that the background function execution will show up nested under the webhook request trace, so that we have everything related to that request in one place. [slack_lead_qualifier/app.py (L11-L16)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/app.py#L11-L16) ```py ... def process_slack_member(profile: Profile): from .modal import process_slack_member as _process_slack_member _process_slack_member.spawn( profile.model_dump(), logfire_ctx=get_context() ) ... ``` ### Modal app Now let's see how easy Modal makes it to deploy all of this. #### Set up Modal The first thing we do is define the Modal app by specifying the base image to use (Debian with Python 3.13), all the Python packages it needs, and all of the secrets defined in the Modal interface that need to be made available at runtime. [slack_lead_qualifier/modal.py (L4-L21)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L4-L21) ```py ... import modal image = modal.Image.debian_slim(python_version='3.13').pip_install( 'pydantic', 'pydantic_ai_slim[openai,duckduckgo]', 'logfire[httpx,fastapi]', 'fastapi[standard]', 'httpx', ) app = modal.App( name='slack-lead-qualifier', image=image, secrets=[ modal.Secret.from_name('logfire'), modal.Secret.from_name('openai'), modal.Secret.from_name('slack'), ], ) ... ``` #### Set up Logfire Next, we define a function to set up Logfire instrumentation for Pydantic AI and HTTPX. We cannot do this at the top level of the file, as the required packages (like `logfire`) will only be available within functions running on Modal (like the ones we'll define next). This file, `modal.py`, runs on your local machine and only has access to the `modal` package. [slack_lead_qualifier/modal.py (L25-L30)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L25-L30) ```py ... def setup_logfire(): import logfire logfire.configure(service_name=app.name) logfire.instrument_pydantic_ai() logfire.instrument_httpx(capture_all=True) ... ``` #### Web app To deploy a [web endpoint](https://modal.com/docs/guide/webhooks) on Modal, we simply define a function that returns an ASGI app (like FastAPI) and decorate it with `@app.function()` and `@modal.asgi_app()`. This `web_app` function will be run on Modal, so inside the function we can call the `setup_logfire` function that requires the `logfire` package, and import `app.py` which uses the other required packages. By default, Modal spins up a container to handle a function call (like a web request) on demand, meaning there's a little bit of startup time for each request. However, Slack requires webhooks to respond within 3 seconds, so we specify `min_containers=1` to keep the web endpoint running and ready to answer requests at all times. This is a bit annoying and wasteful, but fortunately [Modal's pricing](https://modal.com/pricing) is pretty reasonable: you get $30 of free monthly compute, and they offer up to $50k in free credits for startups and academic researchers. [slack_lead_qualifier/modal.py (L34-L41)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L34-L41) ```py ...
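# (Explanatory comments, not in the source file:)
# - `min_containers=1` keeps one container warm so the Slack webhook can always be
#   answered within Slack's 3-second deadline.
# - `setup_logfire()` and the `.app` import run inside the function body, because those
#   packages are only installed in the Modal image, not locally where modal.py is loaded.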
@app.function(min_containers=1) @modal.asgi_app() # type: ignore def web_app(): setup_logfire() from .app import app as _app return _app ... ``` Note Note that `# type: ignore` on the `@modal.asgi_app()` line -- unfortunately `modal` does not fully define its types, so we need this to stop our static type checker `pyright`, which we run over all Pydantic AI code including examples, from complaining. #### Scheduled `send_daily_summary` To define a [scheduled function](https://modal.com/docs/guide/cron), we can use the `@app.function()` decorator with a `schedule` argument. This Modal function will call our imported [`send_daily_summary`](#send_daily_summary) function every day at 8 am UTC. [slack_lead_qualifier/modal.py (L60-L66)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L60-L66) ```py ... @app.function(schedule=modal.Cron('0 8 * * *')) # Every day at 8am UTC async def send_daily_summary(): setup_logfire() from .functions import send_daily_summary as _send_daily_summary await _send_daily_summary() ``` #### Backgrounded `process_slack_member` Finally, we define a Modal function that wraps our [`process_slack_member`](#process_slack_member) function, so that it can run in the background. As you'll remember from when we [spawned this function from the web app](#process_slack_member-with-modal), we passed along the Logfire context to get [Distributed Tracing](https://logfire.pydantic.dev/docs/how-to-guides/distributed-tracing/), so we need to attach it here. [slack_lead_qualifier/modal.py (L45-L56)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L45-L56) ```py ... @app.function() async def process_slack_member(profile_raw: dict[str, Any], logfire_ctx: Any): setup_logfire() from logfire.propagate import attach_context from .functions import process_slack_member as _process_slack_member from .models import Profile with attach_context(logfire_ctx): profile = Profile.model_validate(profile_raw) await _process_slack_member(profile) ... ``` ## Conclusion And that's it! Now, assuming you've met the [prerequisites](#prerequisites), you can run or deploy the app using the commands under [usage](#usage). # SQL Generation Example demonstrating how to use Pydantic AI to generate SQL queries based on user input. Demonstrates: - [dynamic system prompt](../../agents/#system-prompts) - [structured `output_type`](../../output/#structured-output) - [output validation](../../output/#output-validator-functions) - [agent dependencies](../../dependencies/) ## Running the Example The resulting SQL is validated by running it as an `EXPLAIN` query on PostgreSQL. To run the example, you first need to run PostgreSQL, e.g. via Docker: ```bash docker run --rm -e POSTGRES_PASSWORD=postgres -p 54320:5432 postgres ``` *(we run postgres on port `54320` to avoid conflicts with any other postgres instances you may have running)* With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.sql_gen ``` ```bash uv run -m pydantic_ai_examples.sql_gen ``` or to use a custom prompt: ```bash python -m pydantic_ai_examples.sql_gen "find me errors" ``` ```bash uv run -m pydantic_ai_examples.sql_gen "find me errors" ``` This model uses `gemini-1.5-flash` by default since Gemini is good at single shot queries of this kind. 
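If you want to experiment with a different model, you don't have to edit the agent definition: as with any Pydantic AI agent, the model set at construction time is only a default and can be overridden for a single run. Here is a minimal, hypothetical sketch (the `try_other_model` helper and the `openai:gpt-4o` choice are illustrative and not part of the example):

```python
from pydantic_ai import Agent

# Hypothetical standalone sketch: the default model is fixed when the agent is created,
# but any individual run may pass `model=` to use a different one (assuming the
# corresponding API key is configured).
agent = Agent('google-gla:gemini-1.5-flash')


async def try_other_model(prompt: str) -> str:
    result = await agent.run(prompt, model='openai:gpt-4o')  # per-run model override
    return result.output
```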
## Example Code [sql_gen.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/sql_gen.py) ```py """Example demonstrating how to use Pydantic AI to generate SQL queries based on user input. Run postgres with: mkdir postgres-data docker run --rm -e POSTGRES_PASSWORD=postgres -p 54320:5432 postgres Run with: uv run -m pydantic_ai_examples.sql_gen "show me logs from yesterday, with level 'error'" """ import asyncio import sys from collections.abc import AsyncGenerator from contextlib import asynccontextmanager from dataclasses import dataclass from datetime import date from typing import Annotated, Any, TypeAlias import asyncpg import logfire from annotated_types import MinLen from devtools import debug from pydantic import BaseModel, Field from pydantic_ai import Agent, ModelRetry, RunContext, format_as_xml # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_asyncpg() logfire.instrument_pydantic_ai() DB_SCHEMA = """ CREATE TABLE records ( created_at timestamptz, start_timestamp timestamptz, end_timestamp timestamptz, trace_id text, span_id text, parent_span_id text, level log_level, span_name text, message text, attributes_json_schema text, attributes jsonb, tags text[], is_exception boolean, otel_status_message text, service_name text ); """ SQL_EXAMPLES = [ { 'request': 'show me records where foobar is false', 'response': "SELECT * FROM records WHERE attributes->>'foobar' = false", }, { 'request': 'show me records where attributes include the key "foobar"', 'response': "SELECT * FROM records WHERE attributes ? 'foobar'", }, { 'request': 'show me records from yesterday', 'response': "SELECT * FROM records WHERE start_timestamp::date > CURRENT_TIMESTAMP - INTERVAL '1 day'", }, { 'request': 'show me error records with the tag "foobar"', 'response': "SELECT * FROM records WHERE level = 'error' and 'foobar' = ANY(tags)", }, ] @dataclass class Deps: conn: asyncpg.Connection class Success(BaseModel): """Response when SQL could be successfully generated.""" sql_query: Annotated[str, MinLen(1)] explanation: str = Field( '', description='Explanation of the SQL query, as markdown' ) class InvalidRequest(BaseModel): """Response the user input didn't include enough information to generate SQL.""" error_message: str Response: TypeAlias = Success | InvalidRequest agent = Agent[Deps, Response]( 'google-gla:gemini-1.5-flash', # Type ignore while we wait for PEP-0747, nonetheless unions will work fine everywhere else output_type=Response, # type: ignore deps_type=Deps, ) @agent.system_prompt async def system_prompt() -> str: return f"""\ Given the following PostgreSQL table of records, your job is to write a SQL query that suits the user's request. 
Database schema: {DB_SCHEMA} today's date = {date.today()} {format_as_xml(SQL_EXAMPLES)} """ @agent.output_validator async def validate_output(ctx: RunContext[Deps], output: Response) -> Response: if isinstance(output, InvalidRequest): return output # gemini often adds extraneous backslashes to SQL output.sql_query = output.sql_query.replace('\\', '') if not output.sql_query.upper().startswith('SELECT'): raise ModelRetry('Please create a SELECT query') try: await ctx.deps.conn.execute(f'EXPLAIN {output.sql_query}') except asyncpg.exceptions.PostgresError as e: raise ModelRetry(f'Invalid query: {e}') from e else: return output async def main(): if len(sys.argv) == 1: prompt = 'show me logs from yesterday, with level "error"' else: prompt = sys.argv[1] async with database_connect( 'postgresql://postgres:postgres@localhost:54320', 'pydantic_ai_sql_gen' ) as conn: deps = Deps(conn) result = await agent.run(prompt, deps=deps) debug(result.output) # pyright: reportUnknownMemberType=false # pyright: reportUnknownVariableType=false @asynccontextmanager async def database_connect(server_dsn: str, database: str) -> AsyncGenerator[Any, None]: with logfire.span('check and create DB'): conn = await asyncpg.connect(server_dsn) try: db_exists = await conn.fetchval( 'SELECT 1 FROM pg_database WHERE datname = $1', database ) if not db_exists: await conn.execute(f'CREATE DATABASE {database}') finally: await conn.close() conn = await asyncpg.connect(f'{server_dsn}/{database}') try: with logfire.span('create schema'): async with conn.transaction(): if not db_exists: await conn.execute( "CREATE TYPE log_level AS ENUM ('debug', 'info', 'warning', 'error', 'critical')" ) await conn.execute(DB_SCHEMA) yield conn finally: await conn.close() if __name__ == '__main__': asyncio.run(main()) ``` This example shows how to stream markdown from an agent, using the [`rich`](https://github.com/Textualize/rich) library to highlight the output in the terminal. It'll run the example with both OpenAI and Google Gemini models if the required environment variables are set. Demonstrates: - [streaming text responses](../../output/#streaming-text) ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.stream_markdown ``` ```bash uv run -m pydantic_ai_examples.stream_markdown ``` ## Example Code [stream_markdown.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/stream_markdown.py) ```py """This example shows how to stream markdown from an agent, using the `rich` library to display the markdown. 
Run with: uv run -m pydantic_ai_examples.stream_markdown """ import asyncio import os import logfire from rich.console import Console, ConsoleOptions, RenderResult from rich.live import Live from rich.markdown import CodeBlock, Markdown from rich.syntax import Syntax from rich.text import Text from pydantic_ai import Agent from pydantic_ai.models import KnownModelName # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() agent = Agent() # models to try, and the appropriate env var models: list[tuple[KnownModelName, str]] = [ ('google-gla:gemini-2.0-flash', 'GEMINI_API_KEY'), ('openai:gpt-4o-mini', 'OPENAI_API_KEY'), ('groq:llama-3.3-70b-versatile', 'GROQ_API_KEY'), ] async def main(): prettier_code_blocks() console = Console() prompt = 'Show me a short example of using Pydantic.' console.log(f'Asking: {prompt}...', style='cyan') for model, env_var in models: if env_var in os.environ: console.log(f'Using model: {model}') with Live('', console=console, vertical_overflow='visible') as live: async with agent.run_stream(prompt, model=model) as result: async for message in result.stream_output(): live.update(Markdown(message)) console.log(result.usage()) else: console.log(f'{model} requires {env_var} to be set.') def prettier_code_blocks(): """Make rich code blocks prettier and easier to copy. From https://github.com/samuelcolvin/aicli/blob/v0.8.0/samuelcolvin_aicli.py#L22 """ class SimpleCodeBlock(CodeBlock): def __rich_console__( self, console: Console, options: ConsoleOptions ) -> RenderResult: code = str(self.text).rstrip() yield Text(self.lexer_name, style='dim') yield Syntax( code, self.lexer_name, theme=self.theme, background_color='default', word_wrap=True, ) yield Text(f'/{self.lexer_name}', style='dim') Markdown.elements['fence'] = SimpleCodeBlock if __name__ == '__main__': asyncio.run(main()) ``` Information about whales — an example of streamed structured response validation. Demonstrates: - [streaming structured output](../../output/#streaming-structured-output) This script streams structured responses from GPT-4 about whales, validates the data and displays it as a dynamic table using [`rich`](https://github.com/Textualize/rich) as the data is received. ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.stream_whales ``` ```bash uv run -m pydantic_ai_examples.stream_whales ``` Should give an output like this: ## Example Code [stream_whales.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/stream_whales.py) ```py """Information about whales — an example of streamed structured response validation. This script streams structured responses from GPT-4 about whales, validates the data and displays it as a dynamic table using Rich as the data is received. 
Run with: uv run -m pydantic_ai_examples.stream_whales """ from typing import Annotated import logfire from pydantic import Field from rich.console import Console from rich.live import Live from rich.table import Table from typing_extensions import NotRequired, TypedDict from pydantic_ai import Agent # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class Whale(TypedDict): name: str length: Annotated[ float, Field(description='Average length of an adult whale in meters.') ] weight: NotRequired[ Annotated[ float, Field(description='Average weight of an adult whale in kilograms.', ge=50), ] ] ocean: NotRequired[str] description: NotRequired[Annotated[str, Field(description='Short Description')]] agent = Agent('openai:gpt-4', output_type=list[Whale]) async def main(): console = Console() with Live('\n' * 36, console=console) as live: console.print('Requesting data...', style='cyan') async with agent.run_stream( 'Generate me details of 5 species of Whale.' ) as result: console.print('Response:', style='green') async for whales in result.stream_output(debounce_by=0.01): table = Table( title='Species of Whale', caption='Streaming Structured responses from GPT-4', width=120, ) table.add_column('ID', justify='right') table.add_column('Name') table.add_column('Avg. Length (m)', justify='right') table.add_column('Avg. Weight (kg)', justify='right') table.add_column('Ocean') table.add_column('Description', justify='right') for wid, whale in enumerate(whales, start=1): table.add_row( str(wid), whale['name'], f'{whale["length"]:0.0f}', f'{w:0.0f}' if (w := whale.get('weight')) else '…', whale.get('ocean') or '…', whale.get('description') or '…', ) live.update(table) if __name__ == '__main__': import asyncio asyncio.run(main()) ``` Example of Pydantic AI with multiple tools which the LLM needs to call in turn to answer a question. Demonstrates: - [tools](../../tools/) - [agent dependencies](../../dependencies/) - [streaming text responses](../../output/#streaming-text) - Building a [Gradio](https://www.gradio.app/) UI for the agent In this case the idea is a "weather" agent — the user can ask for the weather in multiple locations, the agent will use the `get_lat_lng` tool to get the latitude and longitude of the locations, then use the `get_weather` tool to get the weather for those locations. ## Running the Example To run this example properly, you might want to add two extra API keys **(Note if either key is missing, the code will fall back to dummy data, so they're not required)**: - A weather API key from [tomorrow.io](https://www.tomorrow.io/weather-api/) set via `WEATHER_API_KEY` - A geocoding API key from [geocode.maps.co](https://geocode.maps.co/) set via `GEO_API_KEY` With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.weather_agent ``` ```bash uv run -m pydantic_ai_examples.weather_agent ``` ## Example Code [weather_agent.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/weather_agent.py) ```py """Example of Pydantic AI with multiple tools which the LLM needs to call in turn to answer a question. In this case the idea is a "weather" agent — the user can ask for the weather in multiple cities, the agent will use the `get_lat_lng` tool to get the latitude and longitude of the locations, then use the `get_weather` tool to get the weather. 
Run with: uv run -m pydantic_ai_examples.weather_agent """ from __future__ import annotations as _annotations import asyncio from dataclasses import dataclass from typing import Any import logfire from httpx import AsyncClient from pydantic import BaseModel from pydantic_ai import Agent, RunContext # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() @dataclass class Deps: client: AsyncClient weather_agent = Agent( 'openai:gpt-4.1-mini', # 'Be concise, reply with one sentence.' is enough for some models (like openai) to use # the below tools appropriately, but others like anthropic and gemini require a bit more direction. instructions='Be concise, reply with one sentence.', deps_type=Deps, retries=2, ) class LatLng(BaseModel): lat: float lng: float @weather_agent.tool async def get_lat_lng(ctx: RunContext[Deps], location_description: str) -> LatLng: """Get the latitude and longitude of a location. Args: ctx: The context. location_description: A description of a location. """ # NOTE: the response here will be random, and is not related to the location description. r = await ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/latlng', params={'location': location_description}, ) r.raise_for_status() return LatLng.model_validate_json(r.content) @weather_agent.tool async def get_weather(ctx: RunContext[Deps], lat: float, lng: float) -> dict[str, Any]: """Get the weather at a location. Args: ctx: The context. lat: Latitude of the location. lng: Longitude of the location. """ # NOTE: the responses here will be random, and are not related to the lat and lng. temp_response, descr_response = await asyncio.gather( ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/number', params={'min': 10, 'max': 30}, ), ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/weather', params={'lat': lat, 'lng': lng}, ), ) temp_response.raise_for_status() descr_response.raise_for_status() return { 'temperature': f'{temp_response.text} °C', 'description': descr_response.text, } async def main(): async with AsyncClient() as client: logfire.instrument_httpx(client, capture_all=True) deps = Deps(client=client) result = await weather_agent.run( 'What is the weather like in London and in Wiltshire?', deps=deps ) print('Response:', result.output) if __name__ == '__main__': asyncio.run(main()) ``` ## Running the UI You can build multi-turn chat applications for your agent with [Gradio](https://www.gradio.app/), a framework for building AI web applications entirely in python. Gradio comes with built-in chat components and agent support so the entire UI will be implemented in a single python file! Here's what the UI looks like for the weather agent: ```bash pip install gradio>=5.9.0 python/uv-run -m pydantic_ai_examples.weather_agent_gradio ``` ## UI Code [weather_agent_gradio.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/weather_agent_gradio.py) ```py from __future__ import annotations as _annotations import json from httpx import AsyncClient from pydantic import BaseModel from pydantic_ai.messages import ToolCallPart, ToolReturnPart from pydantic_ai_examples.weather_agent import Deps, weather_agent try: import gradio as gr except ImportError as e: raise ImportError( 'Please install gradio with `pip install gradio`. You must use python>=3.10.' 
) from e TOOL_TO_DISPLAY_NAME = {'get_lat_lng': 'Geocoding API', 'get_weather': 'Weather API'} client = AsyncClient() deps = Deps(client=client) async def stream_from_agent(prompt: str, chatbot: list[dict], past_messages: list): chatbot.append({'role': 'user', 'content': prompt}) yield gr.Textbox(interactive=False, value=''), chatbot, gr.skip() async with weather_agent.run_stream( prompt, deps=deps, message_history=past_messages ) as result: for message in result.new_messages(): for call in message.parts: if isinstance(call, ToolCallPart): call_args = call.args_as_json_str() metadata = { 'title': f'🛠️ Using {TOOL_TO_DISPLAY_NAME[call.tool_name]}', } if call.tool_call_id is not None: metadata['id'] = call.tool_call_id gr_message = { 'role': 'assistant', 'content': 'Parameters: ' + call_args, 'metadata': metadata, } chatbot.append(gr_message) if isinstance(call, ToolReturnPart): for gr_message in chatbot: if ( gr_message.get('metadata', {}).get('id', '') == call.tool_call_id ): if isinstance(call.content, BaseModel): json_content = call.content.model_dump_json() else: json_content = json.dumps(call.content) gr_message['content'] += f'\nOutput: {json_content}' yield gr.skip(), chatbot, gr.skip() chatbot.append({'role': 'assistant', 'content': ''}) async for message in result.stream_text(): chatbot[-1]['content'] = message yield gr.skip(), chatbot, gr.skip() past_messages = result.all_messages() yield gr.Textbox(interactive=True), gr.skip(), past_messages async def handle_retry(chatbot, past_messages: list, retry_data: gr.RetryData): new_history = chatbot[: retry_data.index] previous_prompt = chatbot[retry_data.index]['content'] past_messages = past_messages[: retry_data.index] async for update in stream_from_agent(previous_prompt, new_history, past_messages): yield update def undo(chatbot, past_messages: list, undo_data: gr.UndoData): new_history = chatbot[: undo_data.index] past_messages = past_messages[: undo_data.index] return chatbot[undo_data.index]['content'], new_history, past_messages def select_data(message: gr.SelectData) -> str: return message.value['text'] with gr.Blocks() as demo: gr.HTML( """

<div style="text-align: center">
    <h1>Weather Assistant</h1>
    <p>This assistant answers your weather questions.</p>
</div>

""" ) past_messages = gr.State([]) chatbot = gr.Chatbot( label='Packing Assistant', type='messages', avatar_images=(None, 'https://ai.pydantic.dev/img/logo-white.svg'), examples=[ {'text': 'What is the weather like in Miami?'}, {'text': 'What is the weather like in London?'}, ], ) with gr.Row(): prompt = gr.Textbox( lines=1, show_label=False, placeholder='What is the weather like in New York City?', ) generation = prompt.submit( stream_from_agent, inputs=[prompt, chatbot, past_messages], outputs=[prompt, chatbot, past_messages], ) chatbot.example_select(select_data, None, [prompt]) chatbot.retry( handle_retry, [chatbot, past_messages], [prompt, chatbot, past_messages] ) chatbot.undo(undo, [chatbot, past_messages], [prompt, chatbot, past_messages]) if __name__ == '__main__': demo.launch() ```